Towards a Constructive Version of Banaszczyk's Vector Balancing Theorem

Authors Daniel Dadush, Shashwat Garg, Shachar Lovett, Aleksandar Nikolov,
Plaintext
                           T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58
                                        www.theoryofcomputing.org

                           S PECIAL ISSUE : APPROX-RANDOM 2016



        Towards a Constructive Version of
      Banaszczyk’s Vector Balancing Theorem
                 Daniel Dadush∗                     Shashwat Garg†                Shachar Lovett‡
                                                Aleksandar Nikolov§
              Received December 11, 2016; Revised October 8, 2019; Published December 13, 2019




       Abstract: An important theorem of Banaszczyk (Random Structures & Algorithms 1998)
       states that for any sequence of vectors of `2 norm at most 1/5 and any convex body K
       of Gaussian measure 1/2 in Rn , there exists a signed combination of these vectors which
       lands inside K. A major open problem is to devise a constructive version of Banaszczyk’s
       vector balancing theorem, i. e., to find an efficient algorithm which constructs the signed
       combination.
           We make progress towards this goal along several fronts. As our first contribution, we
       show an equivalence between Banaszczyk’s theorem and the existence of O(1)-subgaussian
       distributions over signed combinations. For the case of symmetric convex bodies, our
       equivalence implies the existence of a universal signing algorithm (i. e., independent of the
    An extended abstract of this paper appeared in the Proceedings of the 20th International Workshop on Randomization and
Computation, 2016 .
  ∗ Supported by the NWO Veni grant 639.071.510.
  † Supported by the Netherlands Organisation for Scientific Research (NWO) under project no. 022.005.025.
  ‡ Supported by an NSF CAREER award 1614023.
  § Supported by an NSERC Discovery grant, application number RGPIN-2016-06333.



ACM Classification: F.2.2
AMS Classification: 68W20
Key words and phrases: discrepancy, vector balancing, convex geometry


© 2019 Daniel Dadush, Shashwat Garg, Shachar Lovett, and Aleksandar Nikolov
c b Licensed under a Creative Commons Attribution License (CC-BY)                           DOI: 10.4086/toc.2019.v015a015
         DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

      body), which simply samples from the subgaussian sign distribution and checks to see if the
      associated combination lands inside the body. For asymmetric convex bodies, we provide
      a novel recentering procedure, which allows us to reduce to the case where the body is
      symmetric.
          As our second main contribution, we show that the above framework can be efficiently
                                                      √
      implemented when the vectors have length O(1/ log n), recovering Banaszczyk’s results un-
      der this stronger assumption. More precisely, we use random walk techniques to produce the
                                                                                        √
      required O(1)-subgaussian signing distributions when the vectors have length O(1/ log n),
      and use a stochastic gradient ascent method to implement the recentering procedure for
      asymmetric bodies.


1    Introduction
Given a family of sets S1 , . . . , Sm over a universe U = [n], the goal of combinatorial discrepancy mini-
mization is to find a bi-coloring χ : U → {−1, 1} such that the discrepancy, i. e., the maximum imbalance
max j∈[m] | ∑i∈S j χ(i)|, is made as small as possible. Discrepancy theory, where discrepancy minimization
plays a major role, has a rich history of applications in computer science as well as mathematics, and we
refer the reader to [24, 12, 13] for a general exposition.
    A beautiful question regards the discrepancy of sparse set systems, i. e., set systems in which each
element appears in at most t sets. A classical theorem√      of Beck and Fiala [9] gives an upper bound of
2t − 1 in this setting. They also conjectured an O( t) bound, which if true would be tight. An improved
Beck–Fiala bound of 2t − log∗ t was given by Bukh [11], where log∗ t is the iterated logarithm function
                                                                                   √
in base 2. Recently, it was shown by Ezra and Lovett [17] that a bound of O( t logt) holds with high
probability when m ≥ n and each element is assigned to t sets uniformly at random. The best general
bounds having sublinear dependence in t currently√depend on n or m. Srinivasan [33] used Beck’s
partial coloring method [8] to give a bound of O( t log min {n, m}). Using techniques from convex
geometry,
   p        Banaszczyk [2] proved a general result on vector balancing (stated below) which implies an
O( t log min {n, m}) bound.
    The proofs of both Srinivasan’s and Banaszczyk’s bounds were non-constructive, that is, they provided
no efficient algorithm to construct the guaranteed colorings, short of exhaustive enumeration. In the
last six years, tremendous progress has been made on the question of matching classical discrepancy
bounds algorithmically. Currently, essentially all discrepancy bounds proved using the partial coloring
method, including Srinivasan’s, have been made constructive [4, 23, 19, 30, 16]. Constructive versions of
Banaszczyk’s result have, however, proven elusive until very recently. In recent work [5], the first and
second named authors jointly with Bansal gave a constructive algorithm for recovering Banaszczyk’s
bound in the Beck–Fiala setting as well as the more general Komlós setting. An alternate algorithm via
multiplicative weight updates was also given recently in [21]. However, finding a constructive version of
Banaszczyk’s more general vector balancing theorem, which has further applications in approximating
hereditary discrepancy, remained an open problem until after an extended abstract [14] of the present
paper was published. This theorem is stated as follows:
Theorem 1.1 (Banaszczyk [2]). Let v1 , . . . , vn ∈ Rm satisfy kvi k2 ≤ 1/5. Then for any convex body
K ⊆ Rm of Gaussian measure at least 1/2, there exists χ ∈ {−1, 1}n such that ∑ni=1 χi vi ∈ K.

                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                             2
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

     In the theorem statement above, and in the rest of the paper we use “Gaussian measure” to refer to
the standard Gaussian measure. The lower bound 1/2 on the Gaussian measure of K is easily seen to
be tight. In particular, if all the vectors are equal to 0, we must have that 0 ∈ K. If we allow Gaussian
measure < 1/2, then K = {x ∈ Rn : x1 ≥ ε}, for ε > 0 small enough, is a clear counterexample. On the
other hand, it is not hard to see that if K has Gaussian measure 1/2 then 0 ∈ K. Otherwise, there exists a
halfspace H containing K but not 0, where H clearly has Gaussian measure less than 1/2.
     Banaszczyk’s theorem gives the best known bound for the notorious Komlós conjecture [32], a
generalization of the Beck–Fiala conjecture, which states that for any sequence of vectors v1 , . . . , vn ∈ Rm
of `2 norm at most 1, there exists χ ∈ {−1, 1}n such that k ∑ni=1 χi vi k∞ is a constant independent of m and
                                                                      √                       √
n. In this context, Banaszczyk’s theorem gives a bound of O( log m), because an O( log m) scaling
of the unit ball of `m
                     ∞ has Gaussian measure 1/2. Banaszczyk’s theorem together with estimates on the
Gaussian measure of slices of the `m ∞ ball due to Barthe, Guedon, Mendelson, and Naor [7] give a bound of
   √
O( log d), where d ≤ min{m, n} is the dimension of the span of v1 , . . . , vn . A well-known
                                                                                            p reduction (see,
e. g., Lecture 9 in [32]), shows that this bound for the Komlós problem implies an O( t log min{m, n})
bound in the Beck—Fiala setting.
     While the above results only deal with the case of K being a cube, Banaszczyk’s theorem has also
been applied to other cases. It was used in [3] to give the best known bound on the Steinitz conjecture. In
this problem, the input is a set of vectors v1 , . . . , vn in Rm of norm at most one and summing to 0. The
aim is to find a permutation π : [n] → [n] to minimise the maximum sum prefix of the vectors rearranged
according to π i. e., to minimize
                                                       k
                                               max    ∑ vπ(i) .
                                               k∈[n] i=1

                                                                   √
The Steinitz conjecture is that this bound should always be O( m), irrespective of the number of vectors,
                                                                                  √     √
and using the vector balancing theorem Banaszczyk proved a bound of O( m + log n) for the `2 norm.
     More recently, Banaszczyk’s theorem was applied to more general symmetric polytopes in Nikolov
and Talwar’s approximation algorithm [28] for a hereditary notion of discrepancy. Hereditary discrepancy
is defined as the maximum discrepancy of any restriction of the set system to a subset of the universe.
In [28] it was shown that an efficiently computable quantity, denoted γ2 , bounds hereditary discrepancy
from above and from below for any given set system, up to polylogarithmic factors. For the upper
bound they used Banaszczyk’s theorem for a natural polytope associated with the set system. However,
without an algorithmic version of of Banaszczyk’s theorem for a general body, it is not clear how to
efficiently compute colorings that achieve the discrepancy upper bounds in terms of γ2 . The recent work
on algorithmic bounds in the Komlós setting [5] does not address this more general problem.
     Banaszczyk’s proof of Theorem 1.1 follows an ingenious induction argument, which folds the effect
of choosing the sign of vn into the body K. The first observation is that finding a point of the set
∑ni=1 {−vi , vi } inside K is equivalent to finding a point of ∑n−1
                                                                i=1 {−vi , vi } in K − vn ∪ K + vn . Inducting on
this set is not immediately possible because it may no longer be convex. Instead, Banaszczyk shows that
there exists a convex subset K ∗ vn of (K − vn ) ∪ (K + vn ) with Gaussian measure at least that of K, as
long as K has measure at least 1/2, which allows induction on K ∗ vn . In the base case, one needs to show
that a convex body of Gaussian measure at least 1/2 must contain the origin, but this fact follows easily
from the hyperplane separation theorem, as indicated above. While extremely elegant, Banaszczyk’s

                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                  3
          DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

proof can be seen as relatively mysterious, as it does not seem to provide any tangible insights as to what
the colorings look like.


1.1   Our results
As our main contribution, we help demystify Banaszczyk’s theorem, by showing that it is equivalent, up to
a constant factor in the length of the vectors, to the existence of certain subgaussian coloring distributions.
Using this equivalence, as our second
                                    p main contribution, we give an efficient algorithm that recovers
Banaszczyk’s theorem up to a O( log min{m, n}) factor for all convex bodies. This improves upon the
best previous algorithms of Rothvoss [30], Eldan and Singh [16], which only recover the theorem for
symmetric convex bodies up to a O(log min{m, n}) factor.
     As a major consequence of our equivalence, we show that for any sequence v1 , . . . , vn ∈ Rm of
sufficiently short vectors there exists a probability distribution χ ∈ {−1, 1}n over colorings such that,
for any symmetric convex body K ⊆ Rm of Gaussian measure at least 1/2, the random variable ∑ni=1 χi vi
lands inside K with probability at least 1/2. Importantly, if such a distribution can be efficiently sampled,
we immediately get a universal sampler for constructing Banaszczyk colorings for all symmetric convex
bodies (we remark that the recent work of [5] constructs a more restricted form of such distributions).
Using random walk techniques, we show how to implement an approximate version                pof this sampler
efficiently, which guarantees the same conclusion when the vectors are of length O(1/ log min {m, n}).
We provide more details on these results in Section 1.1.1 and Section 1.1.2.
     To extend our results to asymmetric convex bodies, we develop a novel recentering procedure and a
corresponding efficient implementation which allows us to reduce the asymmetric setting to the symmetric
one. After this reduction, a slight extension of the aforementioned sampler again yields the desired
colorings. We note that our recentering procedure in fact depends on the target convex body, and hence our
algorithms are no longer universal in this setting. We provide more details on these results in Section 1.1.3
and Section 1.1.4.
     Interestingly, we additionally show that this procedure can be extended p to yield a completely different
coloring algorithm, i. e., not using the sampler, achieving the same O( log min {m, n}) approximation
factor. Surprisingly, the coloring outputted by this procedure is essentially deterministic and has a natural
analytic description, which may be of independent interest.
     Before we continue with a more detailed description on our results, we begin with some terminology
and a well-known reduction. Given a set of vectors v1 , . . . , vn ∈ Rm , we shall call a property hereditary if
it holds for all subsets of the vectors. We note that Banaszczyk’s vector balancing bounds restricted to a
set of vectors are hereditary, since a bound on the maximum `2 norm of the vectors is hereditary. We
shall say that a property of colorings holds in the linear setting, if when given any shift
                                                      (                           )
                                     n                    n
                                                def                           n
                                t ∈ ∑ [−vi , vi ] =       ∑ λi vi : λ ∈ [−1, 1]       ,
                                    i=1               i=1


one can find a coloring (or distribution on colorings) χ ∈ {−1, 1}n such that ∑ni=1 χi vi − t satisfies the
property. It is well-known that Banaszczyk’s theorem also extends by standard arguments to the linear
setting after reducing the `2 norm bound from 1/5 to 1/10 (a factor 2 drop). This follows, for example,

                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                 4
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

from the general inequality between hereditary and linear discrepancy proved by Lovász, Spencer, and
Vesztergombi [22].
    All the results in this work will in fact hold in the linear setting. When treating the linear setting, it is
well known that one can always reduce to the case where the vectors v1 , . . . , vn are linearly independent,
and in our setting, when m = n. In particular, assume we are given some shift t ∈ ∑ni=1 [−vi , vi ] and
that v1 , . . . , vn are not linearly independent. Then, using a standard linear algebraic technique, we can
find a “fractional coloring” x ∈ [−1, 1]n such that ∑ni=1 xi vi = t, and the vectors (vi : i ∈ Ax ) are linearly
                          def
independent, where Ax = {i : xi ∈ (−1, 1)} is the set of fractional coordinates (see Lecture 5 in [32],
or Chapter 4 in [24]). We can think of this as a reduction to coloring the linearly independent vectors
indexed by Ax . Specifically, given x as above, define the lifting function Lx : [−1, 1]Ax → [−1, 1]n by
                                            (
                                             zi : i ∈ Ax ,
                                  Lx (z)i =                       ∀i ∈ [n] .                           (1.1)
                                             xi : i ∈ [n] \ Ax ,

This map takes any coloring χ ∈ {−1, 1}Ax and “lifts” it to a full coloring Lx (χ) ∈ {−1, 1}n . It also
satisfies the property that
                                 Lx (χ) − t = ∑ χi vi − ∑ xi vi .
                                                     i∈Ax       i∈Ax

So, if we can find a coloring χ ∈ {−1, 1}Ax such that

                                            ∑ χi vi − ∑ xi vi ∈ K,
                                           i∈Ax        i∈Ax

then we would have Lx (χ) − t ∈ K as well. Moreover, if we define W as the span of (vi : i ∈ Ax ), then

                 ∑ χi vi − ∑ xi vi ∈ K        if and only if      ∑ χi vi − ∑ xi vi ∈ K ∩W,
                i∈Ax        i∈Ax                                  i∈Ax       i∈Ax

so we can replace K with K ∩W , and work entirely inside W . For convex bodies K with Gaussian measure
at least 1/2, the central section K ∩W has Gaussian measure that is at least as large, so we have reduced
the problem to the case of |Ax | linearly independent vectors in an |Ax |-dimensional space. (See Section 2
for the full details.) We shall thus, for simplicity, state all our results in the setting where the vectors
v1 , . . . , vn are in Rn and are linearly independent.

1.1.1   Symmetric convex bodies and subgaussian distributions
In this section, we detail the equivalence of Banaszczyk’s theorem restricted to symmetric convex bodies
with the existence of certain subgaussian distributions. We begin with the main theorem of this section,
which we note holds in a more general setting than Banaszczyk’s result.

Theorem 1.2 (Main Equivalence). Let T ⊆ Rn be a finite set. Then, the following parameters are
equivalent up to a universal constant factor independent of T and n:

   1. The minimum sb > 0 such that for any symmetric convex body K ⊆ Rn of Gaussian measure at
      least 1/2, we have that T ∩ sb K 6= 0.
                                          /

                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                  5
           DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

   2. The minimum sg > 0 such that there exists an sg -subgaussian random variable Y supported on T .

    We recall that a random vector Y ∈ Rn is s-subgaussian, or subgaussian with parameter s, if for
                                                                         2
any unit vector θ ∈ Sn−1 and t ≥ 0, Pr[|hY, θ i| ≥ t] ≤ 2e−(t/s) /2 . In words, Y is subgaussian if all its
1-dimensional marginals satisfy the same tail bound as the 1-dimensional Gaussian of mean 0 and
standard deviation s.
    To apply the above to discrepancy, we set T = ∑ni=1 {−vi , vi }, i. e., all signed combinations of the
vectors v1 , . . . , vn ∈ Rn . In this context, Banaszczyk’s theorem directly implies that sb ≤ 5 maxi∈[n] kvi k2 ,
and hence by our equivalence that sg = O(1) maxi∈[n] kvi k2 . Furthermore, the above extends to the linear
setting letting T = ∑ni=1 {−vi , vi } − t, for t ∈ ∑ni=1 [−vi , vi ], because, as mentioned above, Banaszczyk’s
theorem extends to this setting as well.
    The existence of the universal sampler claimed in the previous section is in fact the proof that
sb = O(sg ) in the above Theorem. In particular, it follows directly from the following lemma.

Lemma 1.3. Let Y ∈ Rn be an s-subgaussian random variable. There exists an absolute constant c > 0,
such that for any symmetric convex body K ⊆ Rn of Gaussian measure at least 1/2, Pr[Y ∈ s · cK] ≥ 1/2.

     Here, if Y is the sg -subgaussian distribution supported on ∑ni=1 {−vi , vi } − t as above, we simply
let χ denote the random variable such that Y = ∑ni=1 χi vi − t. That χ now yields the desired universal
distribution on colorings is exactly the statement of the lemma.
     As a consequence of the above, we see that to recover Banaszczyk’s theorem for symmetric convex
bodies, it suffices to be able to efficiently sample from an O(1)-subgaussian distribution over sets of the
type ∑ni=1 {−vi , vi } − t, for t ∈ ∑ni=1 [−vi , vi ], when v1 , . . . , vn ∈ Rn are linearly independent and have `2
norm at most 1. Here we rely on homogeneity, that is, if Y is an s-subgaussian random variable supported
on ∑ni=1 {−vi , vi } − t then αY is αs-subgaussian on ∑ni=1 {−αvi , αvi } − αt, for α > 0.
     The proof of Lemma 1.3 (see Section 3 for more details) follows relatively directly from well-known
convex geometric estimates combined with Talagrand’s majorizing measures theorem, which gives a
powerful characterization of the supremum of any Gaussian process.
     Unfortunately, Lemma 1.3 does not hold for asymmetric convex bodies. In particular, if Y = −e1 , the
negated first standard basis vector, and K = {x ∈ Rn : x1 ≥ 0}, the conclusion is clearly false no matter
how much we scale K, even though Y is O(1)-subgaussian and K has Gaussian measure 1/2. One may
perhaps hope that the conclusion still holds if we ask for either Y or −Y to be in s · cK in the asymmetric
setting, though we do not know how to prove this. We note however that this only makes sense when the
support of Y is symmetric, which does not necessarily hold in the linear discrepancy setting.
     We now describe the high level idea of the proof for the reverse direction, namely, that sg = O(sb ).
For this purpose, we show that the existence of a O(sb )-subgaussian distribution on T can be expressed
as a two player zero-sum game, i. e., the first player chooses a distribution on T and the second player
tries to find a non-subgaussian direction. Here the value of the game will be small if and only if the
O(sb )-subgaussian distribution exists. To bound the value of the game, we show that an appropriate
“convexification” of the space of subgaussianity tests for the second player can be associated with
symmetric convex bodies of Gaussian measure at least 1/2. From here, we use von Neumann’s minimax
principle to switch the first and second player, and deduce that the value of the game is bounded using the
definition of sb .

                         T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                     6
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

1.1.2   The random walk sampler
From the algorithmic perspective, it turns out that subgaussianity is a very natural property in the
context of random walk approaches to discrepancy minimization. Our results can thus be seen as a good
justification for the random walk approaches to making Banaszczyk’s theorem constructive.
    At a high level, in such approaches one runs a random walk over the coordinates of a “fractional
coloring” χ ∈ [−1, 1]n until all the coordinates hit either 1 or −1. The steps of such a walk usually come
from Gaussian increments (though not necessarily spherical), which try to balance the competing goals of
keeping discrepancy low and moving the fractional coloring χ closer to {−1, 1}n . Since a sum of small
centered Gaussian increments is subgaussian with the appropriate parameter, it is natural to hope that the
output of a correctly implemented random walk is subgaussian. Our main result in this setting is that
this is indeed possible to a limited extent, with the main caveat being that the walk’s output will not be
“subgaussian enough” to fully recover Banaszczyk’s theorem.
Theorem 1.4. Let v1 , . . . , vn ∈ Rn be vectors of `2 norm at most 1 and let t ∈ ∑ni=1 [−vi , vi ]. Then, there
is an expected polynomial-time algorithm which outputs a random coloring χ ∈ {−1, 1}n such that the
                                       √
random variable ∑ni=1 χi vi − t is O( log n)-subgaussian.
    To achieve the above sampler, we guide our random walk using solutions to the so-called vector
Kómlos program, whose feasibility was first given by Nikolov [26], and show subgaussianity using
well-known martingale concentration bounds. Interestingly, the random walk’s analysis does not rely on
phases, and is instead based on a simple relation between the walk’s convergence time and the subgaussian
parameter. As an added bonus, we also give a new and simple constructive proof of the feasibility of the
vector Kómlos program (see Section 5 for details) which avoids the use of an SDP solver.
    Given the results of the previous section, the above random walk is a universal sampler for constructing
the following colorings.
Corollary 1.5. Let v1 , . . . , vn ∈ Rn be vectors of `2 norm at most 1, let t ∈ ∑ni=1 [−vi , vi ], and let K ⊆ Rn
be a symmetric convex body of Gaussian measure 1/2 (given by a membership oracle). Then, there is
an expected polynomial-time algorithm which outputs a coloring χ ∈ {−1, 1}n such that ∑ni=1 χi vi − t ∈
   √
O( log n)K.
    As mentioned previously, the best previous algorithms in this setting are due to Rothvoss [30], Eldan
and Singh [16], which find a signed combination inside O(log n)K. Furthermore, these algorithms are not
universal, i. e., they heavily depend on the body K. We note that these algorithms are in fact tailored to
find partial colorings inside a symmetric convex body K of Gaussian measure at least 2−cn , for c > 0
small enough, a setting in which our sampler does not provide any guarantees.
    We now recall prior work on random walk based discrepancy minimization. The random walk
approach was pioneered by Bansal [4], who used a semidefinite program to guide the walk and gave
                                                     √
the first efficient algorithm matching the classic O( n) bound of Spencer [31] for the combinatorial
discrepancy of set systems satisfying m = O(n). Later, Lovett and Meka [23] provided a greatly simplified
walk, removing the need for the semidefinite program, which recovered the full power of Beck’s entropy
method for constructing partial colorings. Harvey, Schwartz, and Singh [19] defined another random
walk based algorithm, which, unlike previous work and similarly to our algorithm, does not explicitly use
phases or produce partial colorings. The random walks of [23] and [19] both depend on the convex body

                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                   7
          DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

K; the walk in [23] is only well-defined in a polytope, while the one in [19] remains well-defined in any
convex body, although the analysis still applies only to the polyhedral setting. Most directly related to
this paper is the recent work [5], which gives a random walk that can be viewed as a randomized variant
of the original 2t − 1 Beck–Fiala proof. This walk induces a distribution χ ∈ {−1, 1}n on colorings for
which each coordinate of the output ∑ni=1 χi vi is O(1)-subgaussian. From the discrepancy perspective,
this gives a sampler which finds colorings inside any axis parallel box of Gaussian measure at least 1/2
(and their rotations, though not in a universal manner), matching Banaszczyk’s result for this class of
convex bodies.

1.1.3   Asymmetric convex bodies
In this section, we explain how our techniques extend to the asymmetric setting. The main difficulty in
the asymmetric setting is that one cannot hope to increase the Gaussian mass of an asymmetric convex
body by simply scaling it. In particular, if we take K ⊆ Rn to be a halfspace through the origin, e. g.,
{x ∈ Rn : x1 ≥ 0}, then K has Gaussian measure exactly 1/2 but sK = K for all s > 0. At a technical
level, the lack of any measure increase under scaling breaks the proof of Lemma 1.3, which is crucial for
showing that subgaussian coloring distributions produce combinations that land inside K.
    The main idea to circumvent this problem will be to reduce to a setting where the mass of K is
“symmetrically distributed” about the origin, in particular, when the barycenter of K under the induced
Gaussian measure is at the origin. For such a body K, we show that a constant factor scaling of K ∩ −K
also has Gaussian mass at least 1/2, yielding a direct reduction to the symmetric setting.
    To achieve this reduction, we will use a novel recentering procedure, which will both carefully fix
certain coordinates of the coloring as well as shift the body K to make its mass more “symmetrically
distributed.” The guarantees of this procedure are stated below:

Theorem 1.6 (Recentering Procedure). Let v1 , . . . , vn ∈ Rn be linearly independent, t ∈ ∑ni=1 [−vi , vi ], and
K ⊆ Rn be a convex body of Gaussian measure at least 1/2. Then, there exists a fractional coloring
x ∈ [−1, 1]n , such that for p = ∑ni=1 xi vi − t, Ax = {i ∈ [n] : xi ∈ (−1, 1)} and W = span(vi : i ∈ Ax ), the
following inequalities hold.

   1. p ∈ K.

   2. The Gaussian measure of (K − p) ∩W on W is at least the Gaussian measure of K.

   3. The barycenter of (K − p) ∩W is at the origin, i. e.,
                                             Z
                                                               2
                                                         ye−kyk /2 dy = 0.                                 (1.2)
                                               (K−p)∩W


    By convention, if the procedure returns a full coloring x ∈ {−1, 1}n (in which case, since p ∈ K,
we are done), we shall treat conditions 2 and 3 as satisfied, even though W = {0}. At a high level, the
recentering procedure allows us to reduce the initial vector balancing problem to one in a possibly lower
dimension with respect to “well-centered” convex body of no smaller Gaussian measure, and in particular,
of Gaussian measure at least 1/2. Interestingly, as mentioned earlier in the introduction, the recentering

                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                  8
       T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

procedure can also be extended to yield a full coloring algorithm. We explain the high level details of its
implementation together with this extension in the next subsection.
     To explain how to use the fractional coloring x from Theorem 1.6 to get a useful reduction, recall the
lifting function Lx : [−1, 1]Ax → [−1, 1]n defined in (1.1). We reduce the initial vector balancing problem
to the problem of finding a coloring χ ∈ {−1, 1}Ax such that

                                     ∑ χi vi − ∑ xi vi ∈ (K − p) ∩W
                                    i∈Ax       i∈Ax

(note that ∑i∈Ax χi vi − ∑i∈Ax xi vi ∈ W by construction). Then we can lift this coloring to Lx (χ), which
satisfies
                                                                 n
                         ∑ χi vi − ∑ xi vi ∈ (K − p) ∩W ⇔ ∑ Lx (χ)i vi − t ∈ K.
                        i∈Ax       i∈Ax                       i=1

                                     def
    From here, the guarantee that K 0 = (K − p) ∩W has Gaussian measure at least 1/2 and barycenter at
the origin allows a direct reduction to the symmetric setting. Namely, we can replace K 0 by the symmetric
convex body K 0 ∩ −K 0 without losing “too much” of the Gaussian measure of K 0 . This is formalized by
the following extension of Lemma 1.3, which directly implies a reduction to subgaussian sampling as in
Section 1.1.1.

Lemma 1.7. Let Y ∈ Rn be an s-subgaussian random variable. There exists an absolute constant
c > 0, such for any convex body K ⊆ Rn of Gaussian measure at least 1/2 and barycenter at the origin,
Pr[Y ∈ s · c(K ∩ −K)] ≥ 1/2.

    In particular, if there exists a distribution over colorings χ ∈ {−1, 1}Ax such that

                                             ∑ χi vi − ∑ xi vi
                                            i∈Ax       i∈Ax

as above is 1/c-subgaussian, Lemma 1.7 implies that the random signed combination lands inside K 0
with probability at least 1/2. Thus, the asymmetric setting can be effectively reduced to the symmetric
one, as claimed.
    Crucially, the recentering procedure in Theorem 1.6 can be implemented in probabilistic polynomial
time if one relaxes the barycenter condition from being exactly 0 to having “small” norm (see Section 6.3
for details). Furthermore, the estimate in Lemma 1.7 will be robust to such perturbations. Thus, to
constructively recover the colorings in the asymmetric setting, it will still suffice to be able to generate
good subgaussian coloring distributions.
    Combining the sampler from Theorem 1.4 together with the recentering procedure, we constructively
                                                                      √
recover Banaszczyk’s theorem for general convex bodies up to a O( log n) factor.

Theorem 1.8 (Weak Constructive Banaszczyk). There exists a probabilistic polynomial-time algorithm
which, on input a linearly independent set of vectors v1 , . . . , vn ∈ Rn of `2 norm at most 1, and t ∈
∑ni=1 [−vi , vi ], and a (not necessarily symmetric) convex body K ⊆ Rn of Gaussian measure at least 1/2
(given by a membership
                     p         oracle), computes a coloring χ ∈ {−1, 1}n such that with high probability
∑ni=1 χi vi − t ∈ c log(2n) K, where C is an absolute constant.

                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                              9
          DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

    As far as we are aware, the above theorem gives the first algorithm to recover Banaszczyk’s result for
asymmetric convex bodies under any non-trivial restriction. In this context, we note that the algorithm
of Eldan and Singh [16] finds “relaxed” partial colorings, i. e.„ where the fractional coordinates of the
coloring are allowed to fall outside [−1, 1], lying inside an n-dimensional convex body of Gaussian
measure at least 2−cn . However, it is unclear how one could use such partial colorings to recover the
above result, even with a larger approximation factor.

1.1.4   The recentering procedure
In this section, we describe the details of the recentering procedure. We leave a thorough description of
its algorithmic implementation however to Section 6.3, and only provide its abstract instantiation here.
     Before we begin, we give a more geometric view of the vector balancing problem and the recentering
procedure, which help clarify the exposition. Let v1 , . . . , vn ∈ Rn be linearly independent vectors and
t ∈ ∑ni=1 [−vi , vi ]. Given the target body K ⊆ Rn of Gaussian measure at least 1/2, we can restate the vector
balancing problem geometrically as that of finding a vertex of the parallelepiped P = ∑ni=1 [−vi , vi ] − t
lying inside K. Here, the choice of t ensures that 0 ∈ P. Note that this condition is necessary, since
otherwise there exists a halfspace separating P from 0 having Gaussian measure at least 1/2.
     Recall now that in the linear setting, and using this geometric language, Banaszczyk’s theorem implies
that if P contains the origin, and maxi∈[n] kvi k2 ≤ 1/10 (which we do not need to assume here), then any
convex body of Gaussian measure at least 1/2 contains a vertex of P. Thus, for our given target body K,
we should make our situation better replacing P and K by P − q and K − q, if q ∈ P is a shift such that
K − q has higher Gaussian measure than K. In particular, given the symmetry of Gaussian measure, one
would intuitively expect that if the Gaussian mass of K is not symmetrically distributed around 0, there
should be a shift of K which increases its Gaussian measure.
     In the current language, fixing a color χi ∈ {−1, 1} for vector vi , corresponds to restricting ourselves to
finding a vertex in the facet F = χi vi + ∑ j6=i [−v j , v j ] − t of P lying inside K. Again intuitively, restricting
to a facet of P should improve our situation if the Gaussian measure of the corresponding slice of K in
the lower dimension is larger than that of K. To make this formal, note that when inducting on a facet F
of P (which is an n − 1 dimensional parallelepiped), we must choose a center q ∈ F to serve as the new
origin in the lower dimensional space. Precisely, this can be expressed as inducting on the parallelepiped
F − q and shifted slice (K − q) ∩ span(F − q) of K, using the n − 1 dimensional Gaussian measure on
span(F − q).
     With the above viewpoint, one can restate the goal of the recentering procedure as that of finding
a point q ∈ P ∩ K, such that smallest facet F of P containing q, satisfies that (K − q) ∩ span(F − q) has
its barycenter at the origin and Gaussian measure no smaller than that of K. Recall that as long as
(K − q) ∩ span(F − q) has Gaussian measure at least 1/2, we are guaranteed that 0 ∈ K − q ⇒ q ∈ K.
With this geometry in mind, we implement the recentering procedure as follows:
     Compute q ∈ P so that the Gaussian mass of K − q is maximized. If q is on the boundary of P, letting
F denote a facet of P containing q, induct on F − q and the slice (K − q) ∩ span(F − q) as above. If q is
in the interior of P, replace P and K by P − q and K − q, and terminate.
     We now explain why the above achieves the desired result. Firstly, if the maximizer q is in a facet F of
P, then a standard convex geometric argument reveals that the Gaussian measure of (K − q) ∩ span(F − q)
is no smaller than that of K − q, and in particular, no smaller than that of K. Thus, in this case, the

                         T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                    10
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

recentering procedure fixes a color for “free.” In the second case, if q is in the interior of P, then a
variational argument gives that the barycenter of K − q under the induced Gaussian measure must be at
the origin, namely,
                                            Z
                                                       2
                                                   xe−x /2 dx = 0.
                                             K−q

     To conclude this section, we explain how to extend the recentering procedure to directly produce
a deterministic coloring satisfying Theorem 1.8. For this purpose, we shall assume that v1 , . . . , vn have
                      √
length at most c/ log n, for a small enough constant c > 0. To begin, we run the recentering procedure
as above, which returns P and K, with K having its barycenter at the origin. We now replace P, K by a
joint scaling αP, αK, for α > 0 a large enough constant, so that αK has Gaussian mass at least 3/4. At
this point, we run the original recentering procedure again with the following modification: every time
we get to the situation where K has its barycenter at the origin, induct on the closest facet of P closest
to the origin. More precisely, in this situation, compute a point p on the boundary of P closest to the
origin, and, letting F denote the facet containing p, induct on F − p and (K − p) ∩ span(F − p). At the
end, return the final found vertex.
     Notice that, as claimed, the coloring (i. e., vertex) returned by the algorithm is indeed deterministic.
The reason the above algorithm works is the following. While we cannot guarantee, as in the original
recentering procedure, that the Gaussian mass of (K − p) ∩ span(F − p) does not decrease, we can instead
                                                                                     √
show that it decreases only very slowly. In particular, we use the bound of O(1/ log n) on the length of
the vectors v1 , . . . , vn to show that every time we induct, the Gaussian mass drops by at most a 1 − c/n
factor. More generally, if the vectors had length at most d > 0, for d small enough, the drop would be
                                2
of the order 1 − ce−1/(cd) , for some constant c > 0. Since we “massage” K to have Gaussian mass at
least 3/4 before applying the modified recentering algorithm, this indeed allows to induct n times while
keeping the Gaussian mass above 1/2, which guarantees that the final vertex is in K. To derive the bound
on the rate of decrease of Gaussian mass, we prove a new inequality on the Gaussian mass of sections of
a convex body near the barycenter (see Theorem 7.1), which may be of independent interest.
     As a final remark, we note that unlike the subgaussian sampler, the recentering procedure is not
scale invariant. Namely, if we jointly scale P and K by some factor α, the output of the recentering
procedure will not be an α-scaling of the output on the original K and P, as Gaussian measure is not
homogeneous under scalings. Thus, one must take care to appropriately normalize P and K before
applying the recentering procedure to achieve the desired results.
     We now give the high level overview of our recentering step implementation. The first crucial
observation in this context, is that the task of finding t ∈ P maximizing the Gaussian measure of K − t
is in fact a convex program. More precisely, the objective function (Gaussian measure of K − t) is a
logconcave function of t and the feasible region P is convex. Hence, one can hope to apply standard
convex optimization techniques to find the desired maximizer.
     It turns out however, that one can significantly simplify the required task by noting that the recentering
strategy does not in fact necessarily need an exact maximizer, or even a maximizer in P. To see this,
note that if p is a shift such that K − p has larger Gaussian measure than K, then by logconcavity the
shifts K − α p, 0 < α ≤ 1, also have larger Gaussian measure. Thus, if a we find a shift p ∈          / P with
larger Gaussian measure, letting α p be the intersection point with the boundary ∂ P, we can induct on the
facet of P − α p containing 0 and the corresponding slice of K − α p just as before. Given this, we can

                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                11
         DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

essentially “ignore” the constraint p ∈ P and we treat the optimization problem as unconstrained.
    This last observation will allow us to use the following simple gradient ascent strategy. Precisely, we
simply take steps in the direction of the gradient until either we pass through a facet of P or the gradient
becomes “too small.” As alluded to previously, the gradient will exactly equal a fixed scaling of the
barycenter of K − p, p the current shift, under the induced Gaussian measure. Thus, once the gradient
is small, the barycenter will be very close to the origin, which will be good enough for our purposes.
The last nontrivial technical detail is how to efficiently estimate the barycenter, where we note that the
barycenter is the expectation of a random point inside K − p. For this purpose, we simply take an average
of random samples from K − p, where we generate the samples using rejection sampling, using the fact
that the Gaussian measure of K is large.


Conclusion and follow-up work. In conclusion, we have shown a tight connection between the
existence of subgaussian coloring distributions and Banaszczyk’s vector balancing theorem. Furthermore,
we make use of this connection to constructively recover a weaker version of this theorem.
    Subsequently to the publication of an extended abstract of this paper [14], the first three named
authors, jointly with Bansal, showed the existence of a polynomial-time sampling algorithm for an
O(1)-subgaussian distribution on signed sums of vectors with `2 norm at most 1 [6]. Together with the
results in this paper (specifically Corollary 6.6), this sampling algorithm implies a new constructive proof
of Banaszczyk’s theorem in its full generality.
    Despite this success, Banaszczyk’s more recent results on signed series [3], and the last author’s
upper bounds on the discrepancy of axis-aligned boxes [27], both proved using the techniques used in [2],
remain out of reach algorithmically. Extending the techniques of this paper and of [6] to give constructive
proofs of these results is an interesting open problem.


Organization. In Section 2, we provide necessary preliminary background material. In Section 3, we
give the proof of the equivalence between Banaszczyk’s vector balancing theorem and the existence of
subgaussian coloring distributions. In Section 4, we give our random walk based coloring algorithm,
which exploits the feasibility of the vector Komlós program, proved constructively in Section 5. In
Section 6, we describe the recentering procedure and its implementation, and give the algorithmic
reduction from asymmetric bodies to symmetric bodies, giving the proof of Theorem 1.8. In Section 7,
we show how to extend the recentering procedure to a full coloring algorithm. In Section 8, we prove the
main technical estimate on the Gaussian measure of slices of a convex body near the barycenter, which is
needed for the algorithm in Section 7.


Acknowledgments. We would like to thank the American Institute for Mathematics for hosting a recent
workshop on discrepancy theory, where some of this work was done.


2    Preliminaries
Basic concepts. We write log x and log2 x, x > 0, for the logarithm base e and base 2 respectively.

                      T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                              12
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

    For a vector x ∈ Rn , we define                    s
                                                           n
                                              kxk2 =       ∑ xi2
                                                        i=1

to be its Euclidean norm. Let Bn2 = {x ∈ Rn : kxk2 ≤ 1} denote the unit Euclidean ball and Sn−1 =
{x ∈ Rn : kxk2 = 1} = ∂ Bn2 denote the unit sphere in Rn . For x, y ∈ Rn , we denote their inner product
hx, yi = ∑ni=1 xi yi .
     For subsets A, B ⊆ Rn , we denote their Minkowski sum A + B = {a + b : a ∈ A, b ∈ B}. Define
span(A) to be the smallest linear subspace containing A. We denote the boundary of A by ∂ A. We use the
phrase ∂ A relative to span(A) to specify that we are computing the boundary with respect to the subspace
topology on span(A).
     A set K ⊆ Rn is convex if for all x, y ∈ K,λ ∈ [0, 1], λ x + (1 − λ )y ∈ K. K is symmetric if K = −K.
We shall say that K is a convex body if additionally it is closed and has non-empty interior. We note that
the usual terminology, a convex body is also compact (i. e., bounded), but we will state this explicitly
when it is necessary. If convex body contains the origin in its interior, we say that K is 0-centered.
     We will need the concept of a gauge function for 0-centered convex bodies. For bounded symmetric
convex bodies, this functional will define a standard norm.

Proposition 2.1. Let K ⊆ Rn be a 0-centered convex body. Defining the gauge function of the body K by
kxkK = inf {s ≥ 0 : x ∈ sK}, the following holds.

   1. Finiteness: kxkK < ∞, for x ∈ Rn .

   2. Positive homogeneity: kλ xkK = λ kxkK , for x ∈ Rn , λ ≥ 0.

   3. Triangle inequality: kx + ykK ≤ kxkK + kykK , for x, y ∈ Rn .

Furthermore, if K is additionally bounded and symmetric, then k · kK is a norm which we call the norm
induced by K. In particular, k · kK additionally satisfies that kxkK = 0 iff x = 0 and kxkK = k − xkK , ∀x ∈
Rn .

Gaussian and subgaussian random variables. We define n-dimensional standard Gaussian X ∈ Rn
to be the random variable with density
                                        1       2
                                       √ n e−kxk2 /2
                                        2π
for x ∈ Rn .

Definition 2.2 (Subgaussian Random Variable). A random variable X ∈ R is σ -subgaussian, for σ > 0,
if ∀t ≥ 0,
                                                      1       2
                                    Pr[|X| ≥ t] ≤ 2e− 2 (t/σ ) .
We note that the canonical example of a 1-subgaussian distribution is the 1-dimensional standard Gaussian
itself.
     For a vector valued random variable X ∈ Rn , we say that X is σ -subgaussian if all its one dimensional
marginals are. Precisely, X is σ -subgaussian if ∀θ ∈ Sn−1 , the random variable hX, θ i is σ -subgaussian.

                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                             13
         DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

    We remark that from Definition 2.2, it follows directly that if X is σ -subgaussian then αX is |α|σ -
subgaussian for any α ∈ R.
    The following standard lemma allows us to deduce subgaussianity from upper bounds on the Laplace
transform of a random variable. We include a proof in the appendix for completeness.

Lemma 2.3. Let cosh(x) = 21 (ex + e−x ) for x ∈ Rn . Let X ∈ Rn be a random vector. Assume that
                                                           2
                             E[cosh(hw, Xi)] ≤ β ekσ wk2 /2 , ∀w ∈ Rn ,
                                         p
for some σ > 0 and β ≥ 1. Then X is σ log2 β + 1-subgaussian. Furthermore, for X ∈ Rn standard
                                2
Gaussian, E[cosh(hw, Xi)] = ekwk2 /2 for w ∈ Rn .

Gaussian measure. We define γn to be the n-dimensional Gaussian measure on Rn . Precisely, for any
measurable set A ⊆ Rn ,
                                            1
                                               Z
                                                       2
                                γn (A) = √ n e−kxk2 /2 dx,                                   (2.1)
                                            2π A
noting that γn (Rn ) = 1. We will also need lower dimensional Gaussian measures restricted to linear
subspaces of Rn . Thus, if A ⊆ W , W ⊆ Rn a linear subspace of dimension k, then γk (A) should be
understood as the Gaussian measure of A within W , where W is treated as the whole space. When
convenient, we will also use the notation γW (A) to denote γdim(W ) (A ∩W ). When treating one dimensional
Gaussian measure, we will often denote γ1 ((a, b)), where (a, b) is an interval, simply by γ1 (a, b) for
notational convenience. By convention, we define γ0 (A) = 1 if 0 ∈ A and 0 otherwise.
    An important concept used throughout the paper is that of the barycenter under the induced Gaussian
measure.

Definition 2.4 (Barycenter). For a convex body K ⊆ Rn , we define its barycenter under the induced
Gaussian measure, by
                                           1               dx
                                              Z
                                                       2
                                 b(K) = √ n xe−kxk2 /2          .
                                           2π K          γn (K)
Note that b(K) = E[X], if X is the random variable supported on K with probability density

                                           1       2
                                          √ n e−kxk2 /2 /γn (K).
                                           2π
Extending the definition to slices of K, for any linear subspace W ⊆ Rn , we refer to the barycenter of
K ∩W to denote the one relative to the dim(W )-dimensional Gaussian measure on W (i. e., treating W as
the whole space).

    Throughout the paper, we will need many inequalities regarding the Gaussian measure. The first
important inequality is the Prékopa-Leindler inequality, which states that for λ ∈ [0, 1] and A, B, λ A +
(1 − λ )B ⊆ Rn measurable subsets, that

                                 γn (λ A + (1 − λ )B) ≥ γn (A)λ γn (B)1−λ .                          (2.2)

                      T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                            14
       T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

We note that the Prékopa-Leindler inequality applies more generally to any logconcave measure on Rn ,
i. e., a measure defined by a density whose logarithm is concave. Importantly, this inequality directly
implies that if A ⊆ Rn is convex, then log γn (A + t), for t ∈ Rn , is a concave function of t.
      We will need the following powerful inequality of Ehrhard, which provides a crucial strengthening of
Prékopa-Leindler for Gaussian measure.

Theorem 2.5 (Ehrhard’s inequality [15, 10]). For Borel sets A, B ⊆ Rn and 0 ≤ λ ≤ 1,

                     Φ−1 (γn (λ A + (1 − λ )B)) ≥ λ Φ−1 (γn (A)) + (1 − λ )Φ−1 (γn (B))

where Φ(a) = γ1 ((−∞, a]) for all a ∈ R.

   The power of the Ehrhard inequality is that it allows us to reduce many non-trivial inequalities about
Gaussian measure to two dimensional ones.
   One can use it to show the following standard inequality on the Gaussian measures of slices of a
convex body. We include a proof for completeness.

Lemma 2.6. Given a convex body K ⊆ Rn with γn (K) ≥ 1/2, and a linear subspace H ⊆ Rn of dimension
k. Then, γk (K ∩ H) ≥ γn (K).

Proof. Clearly it suffices to prove the lemma for k = n − 1. Since Gaussian distribution is rotation
invariant, without loss of generality, H = {x ∈ Rn : x1 = 0}. Let Kt = {x ∈ K : x1 = t} denote a slice of
K at x1 = t. Then,
                                            Z ∞ −t 2 /2
                                                e
                                   γn (K) =     √ γn−1 (Kt − te1 )dt
                                             −∞   2π
where γn−1 (Kt − te1 ) = 0 outside support of K.
   Define W ⊆ R2 as W = {(x, y) : y ≤ f (x)} where f : R → R is defined as

                                        f (t) = Φ−1 (γn−1 (Kt − te1 ))

and f (t) = −∞ outside the support of K. It follows that γ2 (W ) = γn (K) ≥ 1/2. By Ehrhard’s inequality,
 f is concave on its support. Hence, W is a closed convex body.
     Let g = Φ−1 (γn (K)) ≥ 0. γn−1 (K ∩H) ≥ γn (K) is then equivalent to showing (0, g) ∈ W . If (0, g) 6∈ W ,
then there exists a halfspace L such that W ⊆ L and (0, g) 6∈ L. Let d be the distance of origin (0, 0) from
∂ L, the boundary of L. Since (0, g) 6∈ L and γ2 (L) ≥ 1/2, d < g. But this implies

                            γ2 (L) = γ1 (−∞, d) < γ1 (−∞, g) = γn (K) = γ2 (W )

contradicting W ⊆ L.

Vector balancing: Reduction to the linearly independent case. In this section, we detail the standard
vector balancing reduction to the case where the vectors are linearly independent. We will also cover
some useful related concepts and definitions, which will be used throughout the paper.

                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                15
         DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

Definition 2.7 (Lifting Function). For a fractional coloring x ∈ [−1, 1]n , denote the set of fractional
coordinates by Ax = {i ∈ [n] : xi ∈ (−1, 1)}. From here, for z ∈ [−1, 1]Ax , we define the lifting function
Lx : [−1, 1]Ax → [−1, 1]n by
                                           (
                                            zi : i ∈ Ax ,
                                 Lx (z)i =                      ∀i ∈ [n] .
                                            xi : i ∈ [n] \ Ax ,

Importantly, for χ ∈ {−1, 1}Ax we have that Lx (χ) ∈ {−1, 1}n . Thus, Lx sends full colorings in {−1, 1}Ax
to full colorings in {−1, 1}n .
   The lifting function above is useful in that it allows us, given a fractional coloring x ∈ [−1, 1] with
some of its coordinates set to {−1, 1}, to reduce any linear vector balancing problem to one on a smaller
number of coordinates. We detail this in the following lemma.
Lemma 2.8. Let v1 , . . . , vm ∈ Rm , t ∈ ∑ni=1 [−vi , vi ], and K ⊆ Rn . Then given a fractional coloring
x ∈ [−1, 1]n and p = ∑ni=1 xi vi − t, the following holds.
   1. For z ∈ [−1, 1]Ax , we have that
                                                                             n
                                             ∑ zi vi − ∑ xi vi + p = ∑ Lx (z)i vi − t .
                                             i∈Ax          i∈Ax             i=1


   2. For z ∈ [−1, 1]Ax , we have that
                                                                                n
                                      ∑ zi vi − ∑ xi vi ∈ K − p ⇔ ∑ Lx (z)i vi − t ∈ K .
                                      i∈Ax          i∈Ax                    i=1

Proof. The first part follows from the computation

                              ∑ Lx (z)i vi − t = ∑ zi vi + ∑ xi vi − t
                              i∈[n]                        i∈Ax      i∈[n]\Ax

                                                      = ∑ zi vi − ∑ xi vi + ( ∑ xi vi − t)
                                                           i∈Ax      i∈Ax           i∈[n]

                                                      = ∑ zi vi − ∑ xi vi + p .
                                                           i∈Ax      i∈Ax

The second part follows since
                                                                                             n
             ∑ zi vi − ∑ xi vi ∈ K − p ⇔ ∑ zi vi − ∑ xi vi + p ∈ K ⇔ ∑ Lx (z)i vi − t ∈ K ,
            i∈Ax       i∈Ax                          i∈Ax         i∈Ax                      i=1

where the last equivalence is by part (1).

     In terms of a reduction, the above lemma says in words that the linear vector balancing problem
with respect to the vectors (vi : i ∈ [n]), shift t and set K, reduces to the linear discrepancy problem on
(vi : i ∈ Ax ), shift ∑i∈Ax xi vi and set K − p.
     We now give the reduction to the linearly independent setting.

                      T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                             16
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

Lemma 2.9. Let v1 , . . . , vn ∈ Rm , t ∈ ∑ni=1 [−vi , vi ]. Then, there is a polynomial-time algorithm computing
a fractional coloring x ∈ [−1, 1]n such that the following hold.

   1. ∑ni=1 xi vi = t.

   2. The vectors (vi : i ∈ Ax ) are linearly independent.

   3. For z ∈ [−1, 1]Ax , ∑i∈Ax zi vi − ∑i∈Ax xi vi = ∑ni=1 Lx (z)i vi − t.

Proof. Let x denote a basic feasible solution to the linear system
                              (                                                    )
                                                  n
                                           n
                                     y ∈ R : ∑ yi vi = t, yi ∈ [−1, 1] ∀i ∈ [n]        ,
                                               i=1

which clearly can be computed in polynomial time. Note the system is feasible by construction of t. We
now show that x satisfies the required conditions.
    Let r ≤ n denote the rank of the matrix (v1 , . . . , vn ). Since x is basic, it must satisfy at least n least
of the constraints at it equality. In particular, at least n − r of the bound constraints must be tight. Thus,
since Ax is the set of fractional coordinates, we must have |Ax | ≤ r. Furthermore, the vectors (vi : i ∈ Ax )
must be linearly independent, since otherwise x is not basic. Finally, for z ∈ [−1, 1]Ax , we have that

                   ∑ zi vi − ∑ xi vi = ∑ zi vi + ∑ xi vi − ∑ xi vi = ∑ Lx (z)i vi − t ,
                   i∈Ax       i∈Ax         i∈Ax              i∈[n]\Ax      i∈[n]   i∈[n]

as needed.

    Let us now apply the above lemma to both the vector balancing problem and the subgaussian sampling
problem. First assume that we have a vector balancing problem with respect to v1 , . . . , vn ∈ Rm , shift
t ∈ ∑ni=1 [−vi , vi ], and K ⊆ Rm a convex body of Gaussian measure at least 1/2. Then applying the above
lemma, we get x ∈ [−1, 1]n , such that our vector balancing reduces to the one with respect to (vi : i ∈ Ax ),
shift
                                            ∑ xi vi ∈ ∑ [−vi , vi ],
                                                  i∈Ax           i∈Ax

and K. This follows directly from Lemma 2.9 part 3 using the lifting function Lx . Now let W =
span(vi : i ∈ Ax ), where dim(W ) = |Ax | by linear independence. Clearly, the reduced vector balancing
problem looks for signed combinations in W , and hence we may replace K by K ∩W . Here, note that by
Lemma 2.6,
                                      γ|Ax | (K ∩W ) ≥ γm (K) ≥ 1/2.
Hence, this reduction reduces to a problem of the same type, where in addition, the vectors form a basis
of the ambient space W . For the subgaussian sampling problem, by part 3 of Lemma 2.9, sampling a
random coloring χ ∈ {−1, 1}n such that ∑ni=1 χi vi −t is subgaussian clearly reduces to sampling a random
coloring χ ∈ {−1, 1}Ax such that
                                           ∑ χi vi − ∑ xi vi
                                                      i∈Ax          i∈Ax


                          T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                17
          DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

is subgaussian since this equals ∑ni=1 Lx (χ)i vi − t. Furthermore, since the support of such a support
distribution lives in W , to test subgaussianity we need only check the marginals
                                           *                    +
                                           θ , ∑ χi vi − ∑ xi vi
                                              i∈Ax      i∈Ax

for θ ∈ W ∩ Sm−1 . Thus, we may assume that W is the full space. This completes the needed reductions.

Computational model. To formalize how our algorithms interact with convex bodies, we will use the
following computational model.
    To interact algorithmically with a convex body K ⊆ Rn , we will assume that K is presented by a
membership oracle. Here a membership oracle OK on input x ∈ Rn , outputs 1 if x ∈ K and 0 otherwise.
Interestingly, since we will always assume that our convex bodies have Gaussian measure at least 1/2,
we will not need any additional centering (known point inside K) or well-boundedness (inner contained
and outer containing ball) guarantees.
    The running time of our algorithms will be measured by the number of oracle calls and arithmetic
operations they perform. We note that we use a simple model of real computation here, where we assume
that our algorithms can perform standard operations on real numbers (multiplication, division, addition,
etc.) in constant time.


3    Banaszczyk’s Theorem and subgaussian distributions
In this section, we give the main equivalences between Banaszczyk’s vector balancing theorem and the
existence of subgaussian coloring distributions.
    The fundamental theorem which underlies these equivalences is known as Talagrand’s majorizing
measure theorem, which provides a nearly tight characterization of the supremum of any Gaussian
process using chaining techniques. We now state an essential consequence of this theorem, which will be
sufficient for our purposes. For a reference, see [34].
Theorem 3.1 (Talagrand). Let K ⊆ Rn be a 0-centered convex body and Y ∈ Rn be an s-subgaussian
random vector. Then for X ∈ Rn the n-dimensional standard Gaussian, we have that
                                       E[kY kK ] ≤ s ·CT · E[kXkK ] ,
where CT > 0 is an absolute constant.
    As a consequence of the above theorem together with geometric estimates proved in Subsection 9.3,
we derive the following lemma, which will be crucial to our equivalences and reductions. The first part of
the lemma is a restatement of Lemma 1.3 from the Introduction.
Lemma 3.2 (Reduction to Subgaussianity). Let Y ∈ Rn be s-subgaussian. Then,
    1. If K ⊆ Rn is a symmetric convex body with γn (K) ≥ 1/2, then
                                             E[kY kK ] ≤ 1.5 ·CT · s.
       In particular, Pr[Y ∈ 3 ·CT · sK] ≥ 1/2.

                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                           18
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

   2. If K ⊆ Rn is a convex body with γn (K) ≥ 1/2 and kb(K)k2 ≤ 32√1 2π , then
                                                                √
                                          E[kY kK∩−K ] ≤ 2(1 + π 8 ln 2) ·CT · s.
                                    √
       In particular, Pr[Y ∈ 4(1 + π 8 ln 2) ·CT · s(K ∩ −K)] ≥ 1/2.

Proof of Lemma 3.2. The proof follows immediately by combining Lemmas 9.5, Lemma 9.9 and Theo-
rem 3.1. We note that the lower bounds on the probabilities follow directly by Markov’s inequality.

    To state our equivalence, we will need the definitions of the following geometric parameters.

Definition 3.3 (Geometric Parameters). Let T ⊆ Rn be a finite set.

    • Define sg (T ) > 0 to be least number s > 0 such that there exists an s-subgaussian random vector Y
      supported on T .

    • Define sb (T ) > 0 to be the least number s > 0 such that for any symmetric convex body K ⊆ Rn ,
      γn (K) ≥ 1/2, T ∩ sK 6= 0./

    We now state our main equivalence, which gives a quantitative version of Theorem 1.2 in the
introduction.

Theorem 3.4. Let T ⊆ Rn be a finite set. Then the following inequalities hold.

   1. sb (T ) ≤ 1.5CT · sg (T ).
                √
   2. sg (T ) ≤ 2 · sb (T ).

   Using the above language, we can restate Banaszczyk’s vector balancing theorem restricted to
symmetric convex bodies as follows:

Theorem 3.5 ([2]). Let v1 , . . . , vm ∈ Rn . Then sb (∑m
                                                        i=1 {−vi , vi }) ≤ 5 maxi∈[m] kvi k2 .

   As an immediate corollary of Theorem 3.4 and Theorem 3.5 (extended to the linear setting) we
deduce:

Corollary 3.6. Let v1 , . . . , vm ∈ Rn . Then
                                       m             √
                                     sg ∑ {−vi , vi } ≤ 2 · 5 max kvi k2 .
                                           i=1                     i∈[m]


Furthermore, for t ∈ ∑m
                      i=1 [−vi , vi ],

                                     m                 √
                                   sg ∑ {−vi , vi } − t ≤ 2 · 10 max kvi k2 .
                                         i=1                          i∈[m]



                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                          19
         DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

     As explained in the introduction, the above equivalence shows the existence of a universal sampler for
recovering Banaszczyk’s vector balancing theorem for symmetric convex bodies up to a constant factor
in the length of the vectors. Precisely, this follows directly from Lemma 3.2 part 1 and Corollary 3.6 (for
more details see the proof of Theorem 3.4 below).
     The final ingredient we need is the classical minimax principle of von Neumann.
Theorem 3.7 (Minimax Theorem [25]). Let X ⊆ Rn , Y ⊆ Rm be compact convex sets. Let f : X ×Y → R
be a continuous function such that
   1. f (·, y) : X → R is convex for fixed y ∈ Y ,

   2. f (x, ·) : Y → R is concave for fixed x ∈ X.
Then,
                                    min max f (x, y) = max min f (x, y) .
                                       x∈X y∈Y             y∈Y x∈X

    We now proceed to the proof of Theorem 3.4.

Proof of Theorem 3.4. We treat the two inequalities separately.

Proof of 1. Let Y ∈ T be the sg (T )-subgaussian random variable. Let K ⊆ Rn be a symmetric convex
body such that γn (K) ≥ 1/2. By Lemma 3.2 part 1, we have that

                                           E[kY kK ] ≤ 1.5CT · sg (T ).

Thus, there exists x ∈ T such that x ∈ 1.5CT · sg (T )K. Since this holds for all such K, we have that
sb (T ) ≤ 1.5CT · sg (T ) as needed.

Proof of 2. Recall the definition of the hyperbolic cosine function
                                                     1
                                            cosh(x) = (ex + e−x )
                                                     2
for x ∈ R. Note that cosh is convex, symmetric (cosh(x) = cosh(−x)), and non-negative. For w ∈ Rn ,
                                                     2
define gw : Rn → R≥0 by gw (x) = cosh(hx, wi)/ekwk2 /2 . By Lemma 2.3, note that E[gw (X)] = 1 for X an
n-dimensional standard Gaussian.
    Let D denote the√ set of probability distributions on T . Our goal is to show that there exists D ∈ D
such that Y ∼ D is 2 · sb (T )-subgaussian. By homogeneity, we may replace T by T /sb (T ), and thus
assume that sb (T ) = 1. To show the existence of the subgaussian distribution, we will show that

                                           inf sup E [gw (Y )] ≤ 2.                                   (3.1)
                                           D∈D w∈Rn Y ∼D


√ Before proving the bound (3.1),      we show that this suffices to show the existence of the desired
  2-subgaussian distribution. Let D∗ ∈ D denote the minimizing distribution for (3.1). Then by definition
of gw , we have that
                                                          2
                                E ∗ [cosh(hw,Y i)] ≤ 2ekwk2 /2 ∀ w ∈ Rn .                           (3.2)
                                Y ∼D


                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                            20
       T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

With
  p the bounds on  √ the Laplace transform in (3.2), by Lemma 2.3 with β = 2 and σ = 1, we have that Y
is log2 2 + 1 = 2-subgaussian as needed.
    We now prove the estimate in (3.1). Let C denote the closed convex hull of the functions gw . More
precisely, C is the closure of the set of functions
           (                                                                             )
                                  k                   k
             f : T → R≥0     f = ∑ λi gwi , k ∈ N, ∑ λi = 1, λi ≥ 0 ∀i ∈ [k], wi ∈ Rn ∀i ∈ [k]   .
                                 i=1                i=1

By continuity, we clearly have that
                               inf sup E [gw (Y )] = inf sup E [ f (Y )] .                           (3.3)
                               D∈D w∈Rn Y ∼D                D∈D f ∈C Y ∼D

The strategy will now be to apply the Minimax Theorem 3.7 to (3.3). For this to hold, we first need that
both D and C are both convex and compact. This is clear for D, since D can be associated with the
standard simplex in R|T | . By construction C is also convex, hence we need only prove compactness. Since
T is finite and C is a closed subset of non-negative functions on T , C can be associated in the natural
                                  |T |
way with a closed subset of R≥0 . To show compactness, it suffices to show that this set is bounded. In
particular, it suffices to show that for f ∈ C, maxx∈T f (x) ≤ M for some universal constant M < ∞. Since
every f ∈ C is a limit of convex combinations of the functions gw , w ∈ Rn , it suffices to show that
                                           sup max gw (x) < ∞ .
                                          w∈Rn x∈T

We prove this with the following computation:
                                                                cosh(hw, xi)
                             sup max gw (x) = sup max                      2
                             w∈Rn x∈T            w∈Rn x∈T       ekwk2 /2
                                                          cosh(maxx∈T kxkkwk2 )
                                               ≤ sup                           2
                                                 w∈Rn                 ekwk2 /2
                                                                                      2
                                               ≤ sup e(maxx∈T kxk2 )kwk2 −kwk2 /2
                                                 w∈Rn
                                                                  2
                                               = e(maxx∈T kxk2 ) /2 < ∞.
Thus C is compact as needed. Lastly, note that the function EY ∼D [ f (Y )] on D ×C is bilinear, and hence
is both continuous and satisfies (trivially) the convexity-concavity conditions in Theorem 3.7.
    By compactness of D and C and continuity, we have that
                                inf sup E [ f (Y )] = min max E [ f (Y )] .
                               D∈D f ∈C Y ∼D              D∈D f ∈C Y ∼D

Next, by the Minimax Theorem 3.7, we have that
                       min max E [ f (Y )] = max min E [ f (Y )] = max min f (x)
                       D∈D f ∈C Y ∼D             f ∈C D∈D Y ∼D                     f ∈C x∈T

                                           =              sup              min f (x) .
                                                                           x∈T
                                                     f =∑ki=1 λi gwi
                                                 k
                                                ∑i=1 λi =1, λi ≥0 ∀i∈[k]
                                                 wi ∈Rn ∀i∈[k], k∈N



                      T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                            21
          DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

Take f = ∑ki=1 λi gwi as above (now as a function on Rn ) and let K f = {x ∈ Rn : f (x) ≤ 2}. Our task now
reduces to showing that ∃x ∈ T such that x ∈ K f . Since sb (T ) ≤ 1, it suffices to show that γn (K f ) =
Pr[X ∈ K f ] ≥ 1/2, for X the n-dimensional standard Gaussian, and that K f is symmetric and convex.
Since f is a convex combination of symmetric and convex functions, it follows that K f is symmetric and
convex. Since f is non-negative, by Markov’s inequality
                                                    E[ f (X)] 1 k               1 k    1
                Pr[X ∈
                     / K f ] = Pr[ f (X) ≥ 2] ≤              = ∑ λi E[gwi (X)] = ∑ λi = .
                                                        2     2 i=1             2 i=1  2
Hence Pr[X ∈ K f ] = 1 − Pr[X ∈
                              / K f ] ≥ 1/2, as needed.

          √
4    An O( log n)-subgaussian random walk
                                                                             √
In this section we describe the efficient sampling algorithm for a O( log n)-subgaussian distribution
supported on ∑m  i=1 {−vi , vi }, where maxi kvi k2 ≤ 1. This gives an efficient version of Corollary 3.6 with a
weaker subgaussianity parameter.
    The random walk algorithm is given as Algorithm 1. Step 8 can be executed in polynomial time by
either calling an SDP solver, or executing the algorithm from Section 5. The feasibility of the program is
guaranteed by Theorem 5.1, and also by the results of [26]. The matrix U(t) in step 10 can be computed
by Cholesky decomposition.
    Let us first make some observations that will be useful throughout the analysis. Notice first that the
random process χ(0), . . . , χ(T ) is Markovian. Let ui (t) be the i-th row of U(t). By the definition of Σ(t)
and U(t), kui (t)k2 = Σ(t)ii equals 1 if i ∈ A(t), and 0 otherwise. We have χ(t)i − χ(t − 1)i = γhui (t), r(t)i,
and, because r(t) ∈ {−1, 1}n , by Cauchy-Schwarz we get
                                                           ( √
                                                            γ n if i ∈ A(t),
                                    |χ(t)i − χ(t − 1)i | ≤                                                 (4.1)
                                                            0      otherwise.
    We first analyze the convergence of the algorithm: we show that, with constant probability, the
random walk fixes all coordinates to have absolute value between 1 − δ and 1. First we prepare a lemma.
Lemma 4.1. Let X0 , X1 , X2 , . . . form a martingale sequence adapted to the filtration {Ft } such that
X0 ∈ [−1, 1], and for every t ≥ 1 we have E[(Xt − Xt−1 )2 | Ft−1 ] = σ 2 , and |Xt − Xt−1 | ≤ δ < 1. Denote
τ = inf{t : |Xt | ≥ 1 − δ }. Then E[τ] < (1 − X02 )/σ 2 .
Proof. Define Y0 ,Y1 ,Y2 , . . . to be a martingale with respect to X0 , X1 , X2 , . . . defined by Yt = Xmin{t,τ} .
Because |Xt − Xt−1 | < δ , we easily see by induction that |Yt | < 1 for all t ≥ 0. Therefore, for any t ≥ 1,
we compute
                                                t
                                 1 > EYt2 = ∑ E E[(Yt −Yt−1 )2 | Ft−1 ] + X02
                                              s=1
                                               t
                                           = ∑ σ 2 Pr[τ ≥ s] + X02
                                              s=1
                                           = σ E[min{t, τ}] + X02 .
                                                2



                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                   22
         T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

              √
Algorithm 1 O( log n)-subgaussian Random Walk Algorithm
 1: Input: v1 , . . . , vn ∈ Rm such that maxni=1 kvi k2 ≤ 1; a vector y ∈ ∑ni=1 [−vi , vi ].
                                                                                       √
 2: Output: random signs χ1 , . . . , χn ∈ {−1, 1} such that ∑ni=1 χi vi − y is O(      log n)-subgaussian.
 3: Let V = (vi )ni=1 .
                                √
            2 log2 2n      2 2 ln 2 log2 2n
 4: Let γ =
               n5/2
                      ;δ =       n          ; T = d2/γ 2 e · dlog2 2ne;
                         n
 5: Let χ(0) ∈ [−1, 1] be such that ∑i χ(0)i vi = y.
 6: Let A(1) = {1, . . . , n}
 7: for t = 1, . . . , T do
 8:      Compute Σ(t) ∈ Rn×n , Σ(t)  0, such that

                                                   Σ(t)ii = 1 ∀i ∈ A(t)
                                                   Σ(t)ii = 0 ∀i 6∈ A(t)
                                               V Σ(t)V T  Im .

 9:     Pick r(t) ∈ {−1, +1}n uniformly at random.
10:     χ(t) = χ(t − 1) + γU(t)r(t), where U(t)U(t)| = Σ(t).
11:     A(t + 1) = {i : |χ(t)i | < 1 − δ }.
12: end for
13: Set χi = sign(χ(T )i ) for all i ∈ {1, . . . , n}.
14: if A(T ) = 0/ then
15:     Return χ
16: else
17:     Restart algorithm from line 5.
18: end if



By the monotone convergence theorem, we have that σ 2 E[τ] < 1 − X02 , which was to be proved.

      The next lemma gives our convergence analysis of the random walk.

Lemma 4.2. With probability 1, |χ(t)i | ≤ 1 for all 1 ≤ t ≤ T and all 1 ≤ i ≤ n. With probability at least
1/2, |χ(T )i | ≥ 1 − δ for all 1 ≤ i ≤ n.

Proof. We prove the first claim by induction on t. It is clearly satisfied for t = 0; assume then that the
claim holds up to t − 1, and we will prove it for t. If i 6∈ A(t), then χ(t)i = χ(t − 1)i by (4.1), and the
claim follows by the inductive hypothesis. If i ∈ A(t), then |χi (t)| < 1 − δ , and by (4.1) and the triangle
                                               √                                                    √
inequality, we have |χi (t)| < 1 − δ + γ n < 1, where the final inequality follows because γ n < δ .
     To prove the second claim, we will show that Pr[|χ(T )i | < 1 − δ ] ≤ 1/2n holds for every i. The
claim will then follow by a union bound. Let us fix an arbitrary i ∈ {1, . . . , n}. Define ∆ = d2/γ 2 e, and
let E j , 0 ≤ j ≤ (T − ∆)/∆ be the event that |χ(t)i | < 1 − δ for all t ∈ [ j∆ + 1, ( j + 1)∆]. Observe that if
|χ(t)i | ≥ 1 − δ , then |χ(s)i | = |χ(t)i | ≥ 1 − δ for all t ≤ s ≤ T , and, therefore χ(T )i < 1 − δ if and only
if all the events E0 , . . . , E(T −∆)/∆ hold simultaneously. By this observation, and the Markov property, we

                          T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                               23
          DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

have
                                                               T /∆−1
                                    Pr[|χ(T )i | < 1 − δ ] =       ∏ Pr[E j | χ( j∆)].                         (4.2)
                                                                   j=0

Let τ j = ( j + 1)∆ if E j holds and τ j = min{t ≥ j∆ : |χ(t)i | ≥ 1 − δ }, otherwise. The sequence
χ( j∆)i , . . . , χ(τ j )i , conditioned on χ( j∆), is a martingale. Moreover, since for any t ∈ [ j∆, τ j ] we
have i ∈ A(t), for all such t we get
                  E[(χ(t)i − χ(t − 1)i )2 | χ(t − 1)] = γ 2 E |hui (t), r(t)i|2 = γ 2 kui k22 = γ 2 .          (4.3)
By (4.1) and (4.3), the sequence χ( j∆)i , . . . , χ(τ j )i , conditioned on χ( j∆), satisfies the assumptions of
Lemma 4.1. By the lemma, we have
                                                                   1 − χ( j∆)2i   1
                                     E[τ j − j∆ | χ( j∆)] ≤              2
                                                                                ≤ 2.
                                                                       γ         γ
Since the event E j holds only if τ j ≥ ( j + 1)∆, by Markov’s inequality we have
                                                                      1       1
                                            Pr[E j | χ( j∆)] ≤               ≤ .
                                                                    γ 2∆      2
This bound and (4.2) imply that Pr[|χ(T )i | < 1 − δ ] ≤ 2−T /∆ ≤ 1/2n, which was to be proved.

   To prove that the walk is subgaussian, we will need the following martingale concentration inequality
due to Freedman.
Theorem 4.3 ([18]). Let Z1 , . . . , ZT be a martingale adapted to the filtration {Ft }, let |Zt − Zt−1 | ≤ M for
all t, and let Wt = ∑tj=1 E j−1 [(Z j − Z j−1 )2 | F j−1 ] = ∑tj=1 Var[Z j | F j−1 ]. Then for all λ ≥ 0 and σ 2 ≥ 0,
we have
                                                                                      λ2
                                                                                             
                                                               2
                     Pr[∃t s.t. |Zt − Z0 | ≥ λ and Wt ≤ σ ] ≤ 2 exp −                           .
                                                                                2(σ 2 + Mλ )
   Next we state the main lemma, which, together with an estimate on the error due to rounding, implies
subgaussianity.
                                                        √
Lemma 4.4. The random variable ∑ni=1 χ(T )i vi − y is (γ 2T )-subgaussian.
Proof. Define Yt = ∑ni=1 χ(t)i vi for all t = 1, . . . , T . Notice that Y0 = y. Let us fix a θ ∈ Sn−1 once and for
                                                                                                              2   2
all, and let Zt = hθ ,Yt i for t = 0, . . . , T . We need to show that for every λ > 0, Pr[|ZT | ≥ λ ] ≤ 2e−λ /2σ .
We first observe that Zt is bounded, so we only need to consider λ in a finite range. Indeed, by Lemma 4.2,
Yt ∈ ∑ni=1 [−vi , vi ] with probability 1, so by the triangle inequality, kYt k2 ≤ ∑ni=1 kvi k2 ≤ n. Then, by
Cauchy-Schwarz, |Zt | ≤ n as well, and, therefore, Pr[|ZT | > n] = 0. For the rest of the proof we will
assume that 0 < λ ≤ n.
     Observe that Z0 , . . . , ZT is a martingale. First we prove that the increments are bounded: this follows
from the boundedness of the increments of the coordinates of χ(t). Indeed, by the triangle inequality and
(4.1),
                                n                                        n
            kYt −Yt−1 k2 =     ∑ (χ(t)i − χ(t − 1)i )vi            ≤ ∑ |χ(t)i − χ(t − 1)i |kvi k2 ≤ γ n3/2 .
                               i=1                             2     i=1


                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                    24
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

Then, it follows from Cauchy-Schwarz that |Zt − Zt−1 | ≤ γn3/2 .
    Next we bound the variance of the increments. By the Markov property of the random walk, Zt − Zt−1
is entirely determined by χt−1 . Denoting V = (vi )ni=1 as in the description of Algorithm 1, we have
                E[(Zt − Zt−1 )2 | χt−1 ] = θ | E[(Yt −Yt−1 )(Yt −Yt−1 )| ]θ
                                        = θ |V E[(χ(t) − χ(t − 1))(χ(t) − χ(t − 1))| ]V | θ
                                        = γ 2 θ |VU(t) E[r(t)r(t)| ]U(t)|V | θ
                                        = γ 2 θ |V Σ(t)V | θ
                                        ≤ γ 2.
The penultimate equality follows because E[r(t)r(t)| ] = In and U(t)U(t)| = Σ(t), and the final inequality
follows because Σ(t) was chosen so that V Σ(t)V |  Im .
    We are now ready to apply Theorem 4.3. Using the notation from the theorem, we have shown that
M ≤ γn3/2 , and that Wt ≤ γ 2t for all t, and both bounds hold with probability 1. Let σ 2 = γ 2 T . First we
claim that for any λ ≤ n, Mλ ≤ σ 2 . Indeed,
                                  Mλ ≤ γn5/2 ≤ 2 log2 (2n) ≤ γ 2 T = σ 2 .
                                                                                       2   2
Now, Theorem 4.3 and the above calculation imply that Pr[|ZT − Z0 | ≥ λ ] ≤ 2e−λ /4σ for all 0 < λ ≤ n.
This proves the lemma.
    Finally we state our main theorem.
Theorem 4.5 (Restatement of Theorem 1.4). Algorithm 1 runs in expected polynomial time, and outputs
                                                                     √
a random vector χ such that the random variable ∑ni=1 χi vi − y is O( log n)-subgaussian.
Proof. Let E be the event that for all i, |χ(T )i | ≥ 1 − δ (equivalently, that A(T ) = 0). / The algorithm
takes returns if E holds, and otherwise it restarts. By Lemma 4.2 this event occurs with probability at
least 1/2, so there will be a constant number of restarts in expectation. Since the random walk takes T
steps, where T is polynomial in the input size, and each step can also be executed in polynomial time, it
follows that the expected running time of the algorithm is polynomial.
    Because the algorithm returns an output exactly when E holds, the output is distributed as the random
vector χ conditioned on E. Let us fix a vector θ ∈ Sn−1 once and for all. Let Y be the random variable   √
∑ni=1 χi vi − y and let Z = hθ ,Y i. Let Yt and Zt be defined as in the proof of Lemma 4.4. Let s = γ 2T be
the parameter with which we proved ZT − Z0 is subgaussian in Lemma 4.4. We will show that Z − Z0 ,
                                                                                             −λ 2 /8s2 . Observe
conditioned on E, is 2s-subgaussian, i. e., we will prove   √ that Pr[|Z − Z0 | ≥ λ | E] ≤ 2e
that this inequality is trivially satisfied for λ ≤ λ0 = 2 2 ln 2s, since the right hand side is at least 1 in
this range. For the rest of the proof we will assume that λ > λ0 .
    Conditional on E, and using the triangle inequality, we can bound the distance between Y and Y (T )
by
                                                   n
                              kY −Y (T )k2 =      ∑ (sign(χ(T )i ) − χ(T )i )vi
                                                  i=1                             2
                                                  n
                                             ≤ ∑ (1 − |χ(T )i |)kvi k2 ≤ δ n.
                                                 i=1


                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                 25
          DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

By Cauchy-Schwarz, we get that, conditional on E, |Z − ZT | ≤ δ n, which implies that |Z − Z0 | ≤
|ZT − Z0 | + δ n, by the triangle inequality. Then,
                                                p we have that, conditional on E, |Z − Z0 | ≥ λ ⇒
|ZT − Z0 | ≥ λ − δ n. For every λ > λ0 , δ n ≤ 2 2 ln 2 log2 2n ≤ λ0 /2 < λ /2. Therefore, conditional on
E and for every λ > λ0 , |Z − Z0 | ≥ λ ⇒ |ZT − Z0 | ≥ λ /2. By Lemma 4.4 we have
                                                                                                   2       2
          Pr[|Z − Z0 | ≥ λ | E] ≤ Pr[|ZT − Z0 | ≥ λ /2 | E] ≤ 2 Pr[|ZT − Z0 | ≥ λ /2] ≤ 4e−λ /4s .
                                   2   2                                                       2       2
Recall that for every λ > λ0 , e−λ /8s < 1/2, so the right hand side above is at most 2e−λ /8s , as claimed.
                                                                             √
Therefore, Z − Z0 , conditioned on E, is 2s-subgaussian, and, since s = O( log n), this suffices to prove
the theorem.


5    Constructive vector Komlós
In this section we give a new proof of the main result of [26] that the natural SDP for the Komlós problem
has value at most 1. While the proof in [26] used duality, our proof is direct and immediately yields an
algorithm to compute an SDP solution which only uses basic linear algebraic operations, and does not
need a general SDP solver. This SDP solution can then be used in the random walk in Section 4. We state
the main theorem next.

Theorem 5.1. Let v1 , . . . , vn ∈ Rm be vectors of Euclidean length at most 1, and let α1 , . . . , αn ∈ [0, 1].
There exists an n × n PSD matrix X such that

                                              Xii = αi   ∀1 ≤ i ≤ n,
                                               T
                                           V XV  Im ,

where V = (v1 , . . . , vm ) is the n × m matrix whose columns are the vectors vi .

    To prove Theorem 5.1 we make use of a basic identity about inverses of block matrices. This is a
standard use of the Schur complement and we will not prove it here.

Lemma 5.2. Let                                                  
                                                       A11 A12
                                             A=
                                                       A21 A22
be a (k + `) × (k + `) block matrix, where A11 is a k × k matrix, A12 is a k × ` matrix, A21 is a ` × k matrix,
and A22 is a ` × ` matrix. Assume A and A22 are invertible, and write B = A−1 in block form as
                                                             
                                                   B11 B12
                                           B=                   ,
                                                   B21 B22

where Bi j has the same dimensions as Ai j . Then B11 = (A11 − A12 A−1
                                                                    22 A21 )
                                                                            −1 (i. e., the inverse of the

Schur complement of A11 in A).

    From Lemma 5.2 we derive the main technical claim used in the proof of Theorem 5.1.

                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                   26
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

Lemma 5.3. Let A = V TV be an n × n positive definite matrix, and let v1 , . . . , vn be the columns of V .
Let B = A−1 . Then, for each 1 ≤ i ≤ n
                                                1            1
                                    Bii =                  ≥       ,
                                          k(I − Π−i )vi k2 kvi k22
                                                         2

where Π−i is the orthogonal projection matrix onto span{v j : j 6= i}.
Proof. It is sufficient to prove the lemma for i = 1. Let U be the matrix with columns v2 , . . . , vn . Since A
is positive definite, the principal minor U TU is positive definite as well, and, therefore, invertible. By
Lemma 5.2,
                                                         1
                                     B11 =      2    T
                                                                       .
                                           kv1 k2 − v1 U(U TU)−1U T v1
Let Π = U(U TU)−1U T . Since Π is symmetric and idempotent (i. e., Π2 = Π), it is an orthogonal
projection matrix. Moreover ΠU = U and Π has the same rank as U, so Π is the orthogonal projection
matrix onto the column span of U, i. e., U(U TU)−1U T = Π−1 and the lemma follows.
Proof of Theorem 5.1. We prove the theorem by induction on n.
    In the base case m = 1, we have a single vector v ∈ Rm , kvk2 ≤ 1, and an α ∈ [0, 1]. We set x = α,
and we clearly have vxvT  αI  I.
    We now proceed with the inductive step. Consider first the case that V TV is singular. Then there
exists a vector x 6= 0 such that V x = 0. Scale x so that xi2 ≤ αi for all i, and there exists k such that xk2 = αk .
Apply the inductive hypothesis to the vectors (vi : i 6= k) and the reals (αi0 = αi − xi2 : i 6= k) to get a
matrix Y ∈ R([n]\{k})×([n]\{k}) . Extend Y to a matrix Ỹ ∈ Rn×n by padding with 0’s, i. e., Ỹi j = Yi j if i, j 6= k
and Ỹi j = 0, otherwise. Define X = xxT + Ỹ : it is easy to verify that both conditions of the theorem are
satisfied.
    Finally, assume that V TV is invertible, and let B = (V TV )−1 . Define
                                               β = min αi /Bii ,
                                                       i
                                                k = arg min αi /Bii ,
                                                           i
                                                γ = max αi − β Bii .
                                                       i
Apply the inductive hypothesis to the vectors (vi : i 6= k) and the reals (αi0 = (αi − β bii )/γ : i 6= k) to get
a matrix Y ∈ R([n]\{k})×([n]\{k}) , which we then pad with 0’s to an n × n matrix Ỹ , as we did in the first
case above. Define X as X = β B + γ Ỹ . It is easy to verify that Xii = αi for all i. We have
                         V XV T = βV BV T + γV TỸV = βV (V TV )−1V T + γU TYU,
where U is the submatrix of V consisting of all columns of V except vk . U TYU  I by the induction
hypothesis. Since V (V TV )−1V T is symmetric and idempotent, it is an orthogonal projection matrix,
and therefore V (V TV )−1V T  I. Because Bii ≥ kvi k−2
                                                     2 ≥ 1 by Lemma 5.3, we have γ ≤ maxi αi − β .
Therefore,
                  V XV T = βV (V TV )−1V T + γU TYU  (β + γ)In  (max αi )In  In .
                                                                                  i
This completes the proof.
    Observe that the proof of Theorem 5.1 can be easily turned into an efficient recursive algorithm.

                         T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                    27
          DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

6     Recentering procedure
We now give the crucial tool to reduce the asymmetric setting to the symmetric setting, namely, the
recentering procedure corresponding to Theorem 1.6 in the introduction. We first describe the recentering
procedure and prove Theorem 1.6. Then, in Section 6.2, we detail how to use this procedure to yield the
desired reduction. Finally, in Section 6.3 and Section 6.4 we deal with algorithmic issues related to the
recentering procedure and the reduction.

6.1    Proof of Theorem 1.6
We first recall the desired guarantees. For linearly independent vectors v1 , . . . , vn ∈ Rn , a shift t ∈
∑ni=1 [−vi , vi ], and a convex body K ⊆ Rn of Gaussian measure at least 1/2, we would like to find a
fractional coloring x ∈ [−1, 1]n , such that for p = ∑ni=1 xi vi − t and the subspace W = span(vi : i ∈ Ax ),
the following hold.
    1. p ∈ K.
    2. γ|Ax | ((K − p) ∩W ) ≥ γn (K).
    3. b((K − p) ∩W ) = 0.
    We shall prove this by induction on n. Note that the base case n = 0, reduces to the statement that
0 ∈ K, which is trivial.
    For a fractional coloring x ∈ [−1, 1]n , we remember first that Ax denotes the set of fractional coordi-
nates and that Lx : [−1, 1]Ax → [−1, 1]n is the lifting function (see Definition 2.7 for details).
    Let P = ∑ni=1 [−vi , vi ]−t. Define the function f (p) = log γn (K − p) for p ∈ P. Compute the maximizer
p of f over P. Let x ∈ [−1, 1]n satisfy that p = ∑ni=1 xi vi − t and let W = span(vi : i ∈ Ax ). Note first that
since γn (K − p) ≥ γn (K) ≥ 1/2, by Lemma 9.6 part 1 we have that 0 ∈ K − p ⇒ p ∈ K.
    Assume first that p is in the interior of P. Then, since p is a maximizer and does not touch the
boundary of P, by the KKT conditions we must have that ∇ f (p) = 0. From here, direct computation
reveals that ∇ f (p) = b(K − p). Again, since y does not touch any constraints of P, we see that Ax = [n]
and hence W = Rn . Thus, as claimed, x satisfies the conditions of the theorem.
    Assume now that p ∈ ∂ P. From here, we must have that |Ax | < n and hence dim(W ) = |Ax | < n.
Next, by Lemma 2.6, we see that
                              γ|Ax | ((K − p) ∩W ) ≥ γn (K − p) ≥ γn (K) ≥ 1/2 .
By Lemma 2.8 part 2, for z ∈ [−1, 1]Ax , we have that
                                                                n
                              ∑ zi vi − ∑ xi vi ∈ K − p ⇔ ∑ Lx (z)i vi − t ∈ K .
                             i∈Ax        i∈Ax                  i=1

Thus, we may apply induction on the vectors (vi : i ∈ Ax ), the shift ∑i∈Ax xi vi and convex body (K − p)∩W ,
and recover z ∈ [−1, 1]Ax , such that for Wz = span(vi : i ∈ Az ), we get
                                                         def
                                        ∑ zi vi − ∑ xi vi = pz ∈ K − p ,                                  (6.1)
                                        i∈Ax     i∈Ax


                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                28
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

and
                                     γ|Az | ((K − p − pz ) ∩Wz ) ≥ γ|Ax | (K − p) ,                          (6.2)
and
                                              b((K − p − pz ) ∩Wz ) = 0 .                                    (6.3)
We now claim that w = Lx (z) satisfies the conditions of the theorem. To see this, note that by Lemma 2.8
part 1,
                       n                n
                      ∑ wi vi − t = ∑ Lx (z)i vi − t = ∑ zi vi − ∑ xi vi + p = pz + p .
                      i=1              i=1                 i∈Ax          i∈Ax

Furthermore, since pz ∈ K − p we have that pz + p ∈ K. Next, clearly Aw = Az and hence span(vi : i ∈
Aw ) = Wz . The claim thus follows by combining (6.1), (6.2), (6.3).

6.2   Reduction from asymmetric to symmetric convex bodies
As explained in the introduction, the recentering procedure allows us to reduce Banaszczyk’s vector
balancing theorem for all convex bodies to the symmetric case, and in particular, to the task of subgaussian
sampling. We give this reduction in detail below.
    Let v1 , . . . , vn , t, and K be as in Theorem 1.6, and let x ∈ [−1, 1]n be the fractional coloring guaranteed
by the recentering procedure. As in Theorem 1.6, let p = ∑ni=1 xi vi − t and W = span(vi : i ∈ Ax ). We
shall now assume that maxi∈[n] kvi k2 ≤ c, a constant to be chosen later. From here, by Lemma 2.8 part 2,
for χ ∈ {−1, 1}Ax ,

                            ∑ χi vi − ∑ xi vi ∈ (K − p) ∩W ⇔ ∑ Lx (χ)i vi − t ∈ K .
                           i∈Ax       i∈Ax                             i∈[n]


Let C = (K − p) ∩ W and d = |Ax |. By the guarantees on the recentering procedure, we know that
γd (C) ≥ 1/2 and b(C) = 0. Then by Lemma 9.9 in Section 9.3, for X ∈ W the d-dimensional standard
Gaussian on W , we have that
                                                               √
                             E[kXkC∩−C ] ≤ 2 E[kXkC ] ≤ 2(1 + π 8 ln 2) .
                                                  √
Hence by Markov’s inequality, Pr[X ∈ 4(1 + π 8 ln 2)(C ∩ −C))] ≥ 1/2. At this point, using Ba-
naszczyk’s theorem in the linear setting for symmetric bodies (which loses a factor of 2), if the `2 norm
bound c satisfies
                                        1               √
                                          ≥ 10 · 4(1 + π 8 ln 2) ,
                                        c
then by homogeneity there exists χ ∈ {−1, 1}Ax such that
                                                                   n
                              ∑ χi vi − ∑ xi vi ∈ C ∩ −C ⇒ ∑ Lx (χ)i vi − t ∈ K .
                              i∈Ax          i∈Ax                  i=1

Hence, the reduction to the symmetric case follows.

                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                   29
            DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

    We can also achieve the same with a subgaussian sampler, though the vectors should be shorter. In
particular, applying Corollary 3.6, if

                                                    1 √                     √
                                                      ≥ 2 · 10 ·CT · 4(1 + π 8 ln 2),
                                                    c

then there exists a distribution on colorings χ ∈ {−1, 1}Ax such that

                                                                   ∑ χi vi − ∑ xi vi
                                                                   i∈Ax                i∈Ax
                √
is (CT · 4(1 + π 8 ln 2))−1 -subgaussian. From here, by Lemma 3.2 part 2 applied to C,
                                 "                   !         #
                                      Pr            ∑ χi vi − ∑ xi vi                    ∈ C ∩ −C ≥ 1/2 ,
                                                    i∈Ax                   i∈Ax

as needed.

6.3     Algorithmic recentering procedure
In this subsection we give an algorithmic variant of the recentering procedure in Theorem 1.6.
    Given a convex body K ⊆ Rn , let b be its barycenter under the Gaussian distribution. The following
lemma shows that if we have an estimate b0 of the barycenter, which is close to b but farther from the
origin, then shifting K to K − b0 , increases the Gaussian volume of K.

Lemma 6.1. Let b be the barycenter of K ⊆ Rn and b0 a point in Rn satisfying kb − b0 k2 ≤ δ /3 and
                                    2
kb0 k2 ≥ δ . Then, γn (K − b0 ) ≥ eδ /6 γn (K).

Proof. Let η = b − b0 .
                                  1 n/2          −kxk2 /2 dx           2
        γn (K − b0 )
                                            R
                               ( 2π )   x∈K−b0 e
                         =         1 n/2                           2
           γn (K)                             −kxk2 /2 dx
                                                R
                                ( 2π )   x∈K e
                                  1 n/2     R                  2
                                             −kyk2 /2−kb k2 /2+hy,b i dy     0 2          0
                               ( 2π )   y∈K e
                         =                 1 n/2                              2                    (change of variables y = x + b0 ).
                                                       −kxk2 /2 dx
                                                       R
                                        ( 2π )   x∈K e

Let Y be a random variable drawn from the n-dimensional Gaussian distribution restricted to the body K.
Then the right hand side above is equal to
           0 2           0            0 2                  0
      e−kb k2 /2 E[ehY,b i ] ≥ e−kb k2 /2 ehE[Y ],b i                                                (by Jensen’s inequality)
                                  −kb0 k22 /2+hb,b0 i              −kb0 k22 /2+hb0 ,b0 i+hη,b0 i
                             =e                            =e
                                 kb0 k22 /2+hη,b0 i                    2           2
                             =e                        = ekbk2 /2−kηk2 /2
                                         2                 2                 2
                             ≥ e(2δ /3) /2−(δ /3) /2 = eδ /6                                         (kbk2 ≥ kb0 k2 − kηk2 ≥ 2δ /3).


                             T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                                      30
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

      In our recentering algorithm we use the geometric language of Section 1.1.4. Instead of the vectors
v1 , . . . , vn and the shift t ∈ ∑ni=1 [−vi , vi ], we work directly with the parallelepiped P = ∑ni=1 [−vi , vi ] − t.
Notice that a facet of P corresponds to a fractional coloring with some coordinates fixed. Indeed, a facet
F of P is determined by a subset S ⊆ [n], and a coloring χ ∈ {−1, 1}S , and equals F = ∑i6∈S [−vi , vi ] +
∑i∈S χi vi − t. The size of the set S is equal to the co-dimension of F, so a vertex (face of dimension 0)
is equivalent to full coloring χ ∈ {−1, 1}n . The edges (faces of dimension 1) are linear segments that
have length exactly twice the length of the corresponding vectors. We say that P has side lengths at most
` if each edge of P has length at most `: this corresponds to requiring that maxi kvi k2 ≤ `/2. Given a
point p ∈ P, we denote by FP (p) the face of P that contains p and has minimal dimension. We denote by
WP (p) the subspace span(FP (p) − p).
      In this language, the (linear) discrepancy problem is translated to the problem of finding a vertex
of P inside K. The recentering problem can also be expressed in this way: we are looking for a point
p ∈ P ∩ K such that the Gaussian measure of (K − p) ∩WP (p), restricted to WP (p), is at least that of K, and
b((K − p) ∩WP (p)) is close to 0. To do this, we start out by approximating b = b(K), the barycenter of
K. If b is close to the origin, then we are already done and can return. If b is far from origin, then moving
the origin to b (i. e., shifting K and P to K − b, P − b respectively), should only help us by increasing the
Gaussian volume of K. But we cannot make this move if b lies outside P. In this case, we start moving
towards b; when we hit ∂ P, the boundary of P, we stop and induct on the facet we land on, choosing the
point on boundary of P we stopped on as our new origin. We show that even this partial move towards b
does not decrease the volume of K. Moreover, it ensures that the origin always stays inside P.
      One difficulty is that we cannot efficiently compute the barycenter of K exactly. To get around this,
we use random sampling from Gaussian distribution restricted to K to estimate the barycenter with high
accuracy. We will then return a shift of the body K such that its barycenter is δ -close to the origin, where
the running time is polynomial in n and (1/δ ) and it suffices to choose δ as inversely polynomial in n.
We assume that we have access to a membership oracle for the convex body K.
      The following theorem is an algorithmic version of Theorem 1.6. We note that the guarantees of the
algorithm are relatively robust. This is to make it simpler to use within other algorithms, since it may be
called on invalid inputs as well as output incorrectly with small probability.
Theorem 6.2. Let P be a parallelepiped in Rn containing the origin and K ⊆ Rn be a convex body
of Gaussian measure at least 1/2, given by a membership oracle, and let δ ≥ 0 and ε ∈ (0, 1). Then,
Algorithm 2 on these inputs either returns FAIL or a point p ∈ P ∩ K. Furthermore, if the input is correct,
then with probability at least 1 − ε, it returns p satisfying
   1. the Gaussian measure of (K − p) ∩WP (p) on WP (p), is at least that of K;
   2. kb((K − p) ∩WP (p))k2 ≤ δ .
Moreover, Algorithm 2 runs in time polynomial in n, 1/δ and ln(1/ε).
Proof. Firstly, it easy to check by induction, that at the beginning of each iteration of the for loop that
                    q ∈ P ∩ K,     W = WP (q),       K̄ = (K − q) ∩WP (q),       P̄ = FP (q) − q .               (6.4)
    To prove correctness of the algorithm, we must show that the algorithm returns a point q satisfying
the conditions of Theorem 6.2 with probability at least 1 − ε.

                         T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                      31
          DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

Algorithm 2 Recentering Procedure
 1: Input: Convex body K ⊆ Rn with γn (K) ≥ 1/2, an n-dimensional parallelepiped P 3 0, δ ≥ 0 and
    error probability ε ∈ (0, 1).
 2: Output: See statement of Theorem 6.2.
 3: If 0 6∈ P ∩ K, return FAIL.
 4: Set N = d24/δ 2 e + n.
 5: Set q = 0, W = WP (0), K̄ = K ∩W , P̄ = FP (0).
 6: for i = 1, . . . , N do
 7:     Compute an estimate b0 of the barycenter b of K̄ restricted to the subspace W ,
           satisfying kb − b0 k2 ≤ δ /6 with probability at least 1 − ε/N.
           If b0 6∈ K̄, return FAIL, otherwise continue.
 8:     if kb0 k2 ≤ δ /2 then Return q.
 9:     else if kb0 k2 > δ /2 and b0 6∈ P̄ then
10:           Compute λ ∈ (0, 1) be such that λ b0 ∈ ∂ P̄ relative to W .
11:           Set s = λ b0 .
12:     else
13:           Set s = b0 .
14:     end if
15:     Set q = q + s.
16:     Set W = WP̄ (s), P̄ = FP̄ (s) − s, K̄ = (K̄ − s) ∩W .
17: end for
18: Return FAIL.



    For this purpose, we shall condition on the event that all the barycenter estimates computed on line 7
are within distance δ /6 of the true barycenters, which we denote by E. Since we run the barycenter
estimator at most N times, by the union bound, E occurs with probability at least 1 − ε. We defer the
discussion of how to implement the barycenter estimator till the end of the analysis.
    With this conditioning, we prove a lower bound on the Gaussian mass as a function of the number of
iterations, which will be crucial for establishing the correctness of the algorithm.

Claim 6.3. Let W, K̄, P̄ denote the state after t ≥ 0 non-terminating iterations. Let kt ≥ 0 denote number
of iterations before time t, where the dimension of W decreases. Then, conditioned on E, we have that
                                                            2
                                         γW (K̄) ≥ e(t−kt )δ /24 γn (K) .

Proof. We prove the claim by induction on t. At the base case t = 0 (i. e., at the beginning of the first
iteration), note that kt = 0 by definition. If W = Rn , the inequality clearly holds since K̄ = K. If W ⊂ Rn ,
then since γn (K) ≥ 1/2 by Lemma 2.6, we have γW (K̄) ≥ γn (K). The base case holds thus holds.
     We now assume that the bound holds at time t and prove it for t + 1, assuming that iteration t + 1 is
non-terminating. Let b,b0 ,s denote the corresponding loop variables, and W 0 , K̄ 0 , P̄0 denote the new values
of W, K̄, P̄ after line 16.
     Since the iteration is non-terminating, we have that kb0 k2 > δ /2. Since by our conditioning kb0 −

                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                 32
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

bk2 ≤ δ /6, by Lemma 6.1 and the induction hypothesis, we have that
                                                 2                           2
                              γW (K̄ − b0 ) ≥ eδ /24 γW (K̄) ≥ e(t+1−kt )δ /24 γn (K) .                       (6.5)

Note that we drop in dimension going from W to W 0 if and only if s lies on the boundary of P̄ relative to
W (since then the minimal face of P̄ containing s is lower dimensional).
    We now examine two cases. In the first case, we assume b0 is in the relative interior of P̄. In this case,
we have s = b0 , and hence W = W 0 and K̄ 0 = K̄ − b0 . Given this, kt+1 = kt (no drop in dimension) and the
desired bound is derived directly from equation (6.5).
    In the second case, we assume that b0 is not in the interior of P̄ relative to W . In this case, s = λ b0 ∈ ∂ P̄
relative to W , for some λ ∈ [0, 1]. Furthermore, W 0 ⊂ W and kt+1 = kt + 1. From here, by Ehrhard’s
inequality and equation (6.5), we get that

                   γW (K̄ − s) ≥ Φ((1 − λ )Φ−1 (γW (K̄)) + λ Φ−1 (γW (K̄ − b0 ))) ≥ γW (K̄)
                                          2                            2                                      (6.6)
                               ≥ e(t−kt )δ /24 γn (K) = e(t+1−kt+1 )δ /24 γn (K) ≥ 1/2 .

Lastly, by Lemma 2.6 since γW (K̄ − s) ≥ 1/2, we have that

                                      γW 0 (K̄ 0 ) = γW 0 (K̄ − s) ≥ γW (K̄ − s) .                            (6.7)

The desired bound now follows combining Equations (6.6),(6.7).

    We now prove correctness of the algorithm conditioned on E. We first show that conditioned on E,
the algorithm returns q from line 8 during some iteration of the for loop. For the sake of contradiction,
assume instead that the algorithm returns FAIL. Let W, K̄, P̄ denote the state after the end of the loop.
Then, by Claim 6.3, we have that
                                           2                       2
                      γW (K̄) ≥ e(N−kN )δ /24 γn (K) ≥ e(N−n)δ /24 γn (K) ≥ eγn (K) > 1,

where we used that fact kN ≤ n, since dimension cannot drop more than n times. This is clear contradiction
however, since Gaussian measure is always at most 1.
     Given the above, we can assume that the algorithm returns q during some iteration of the for loop.
Let W, K̄, P̄, b0 denote the state at this iteration. Since we return at this iteration, we must have that
kb0 k2 ≤ δ /2. Given E, we have that the barycenter b of K̄ satisfies

                                 kbk2 ≤ kb0 k2 + kb − b0 k2 ≤ δ /2 + δ /6 < δ .

By Claim 6.3, we also know that γW (K̄) ≥ γn (K). Since by equation (6.4), q ∈ P and K̄ = (K − q) ∩WP (q),
the correctness of the algorithm follows.
    For the running time we note that it is dominated by the N = O(1/δ 2 + n) calls to the barycenter
estimator. Thus, as long as the estimator runs in poly(n, ln(1/(εδ ))) time, the desired bound on the
running time holds.
    It remains to show that we can estimate the barycenter efficiently. We show how to do this
in appendix in Theorem 9.3 with failure probability at most ε/N in time poly(n, 1/δ , ln(N/ε)) =
poly(n, 1/δ , ln(1/ε)), as needed.

                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                    33
           DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

6.4    Algorithmic reduction from asymmetric to symmetric Banaszczyk
In this subsection, we make algorithmic the reduction in Section 6.2 from the asymmetric to the symmetric
case. This will directly imply that given an algorithm to return a vertex of P contained in a symmetric
convex body K of Gaussian volume at least 1/2, we can also efficiently find a vertex of P contained in an
asymmetric convex body of Gaussian measure at least a half.

Definition 6.4 (Symmetric Body Coloring Algorithm). We shall say that A is a symmetric body coloring
algorithm, if given as input an n-dimensional parallelepiped P 3 0 of side lengths at most lA (n), lA a
non-negative non-increasing function of n, and a symmetric convex body K ⊆ Rn satisfying γn (K) ≥ 1/2,
given by a membership oracle, it returns a vertex of P contained in K with probability at least 1/2.
                    √
    Let α = 4(1 + π 8 ln 2). We now present an algorithm, which uses A as a black box and achieves
the same guarantee for asymmetric convex bodies, with only a constant factor loss in the length of the
vectors.

Algorithm 3 Reducing asymmetric convex bodies to symmetric convex bodies
 1: Input: Algorithm A as in Definition 6.4, K ⊆ Rn convex body, given by membership oracle, with
      γn (K) ≥ 1/2, P 3 0 an n-dimensional parallelepiped of side lengths at most lA (n)/α.
 2: Output: A vertex v of P contained in K.
                                                          1
 3: Call Recentering Procedure on K and P and δ =         √   and ε = 1/4.
                                                        32 2π
      Restart from line 3 if the call outputs FAIL, and otherwise let q be the output.
 4: Call A on α(FP (q) − q) and α[(K − q) ∩ (q − K)] ∩WP (q) inside WP (q).
      Let v be the output.
 5: If v/α + q ∈ K and is a vertex of P, return v/α + q. Else, restart from line 3.



Theorem 6.5. Algorithm 3 outputs a vertex v of P contained in K, and runs in expected polynomial time.

Proof. Clearly, by line 5 correctness is trivial, so we need only argue that it runs in expected polynomial
time. Since the running time of the recentering procedure (Algorithm 2) is polynomial, and the runs
are independent, we need only argue that line 5 accepts with constant probability. Since the recentering
procedure outputs correctly with probability at least 1 − ε = 3/4, we may condition on the correctness of
the output q in line 3.
    Under this conditioning, by the guarantees of the recentering algorithm, letting d = dim(FP (q)),
W = WP (q) and C = (K − q) ∩W , we have that

                                                                      1
                                γW (C) ≥ 1/2     and    kb(C)k2 ≤     √ .
                                                                    32 2π
Thus by Lemma 9.9, for X ∈ W the d-dimensional standard Gaussian on W , we have that
                                                             √
                           E[kXkC∩−C ] ≤ 2 E[kXkC ] ≤ 2(1 + π 8 ln 2) .
                                                                √
Hence by Markov’s inequality, Pr[X ∈ α(C ∩ −C)] = Pr[X ∈ 4(1 + π 8 ln 2)(C ∩ −C))] ≥ 1/2.

                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                           34
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

    Now by construction αP has side lengths at most lA (n), and hence α(FP (q) − q) also has side lengths
at most lA (n) ≤ lA (d). Thus, A on input α(FP (q) − q) and α(C ∩ −C), outputs a vertex v of α(FP (q) − q)
contained in α(C ∩ −C) ⊆ α(K − p) ∩ W with probability at least 1/2. Hence, the check in line 5
succeeds with constant probability, as needed.

   As we saw in Lemma 3.2, a symmetric body coloring algorithm is implied by a subgaussian sampler.
Together with Theorem 6.5, this gives us the following corollary.

Corollary 6.6. Suppose that s(n) is a positive non-decreasing function such that the following holds. For
any v1 , . . . , vm ∈ Rn with maxi∈[n] kvi k2 ≤ 1, and for any q ∈ P = ∑mi=1 [−vi , vi ], there exists a distribution
         m
D on ∑i=1 {−vi , vi } such that for v ∼ D, v − q is s(n)-subgaussian, and v ∈ FP (q). Then for any t ∈ P
there exists a q ∈ K, computable in expected polynomial time by a randomized algorithm, such that, for
v ∼ D,
                                                                       1
                                           Pr[v − t ∈ 3α ·CT · s(n)K] ≥ .
                                                                       2
Proof. Suppose we are given a parallelepiped P0 3 0 with side lengths at most l(n) = 2(3αCT s(n))−1 .
We can write P0 as
                                        `(n) m               `(n)
                                  P0 =       ∑ [−vi , vi ] −      q,
                                         2 i=1                2

where q ∈ ∑mi=1 [−vi , vi ] and the vectors v1 , . . . , vm satisfy kvi k2 ≤ 1. By Lemma 3.2 part 1, the algorithm
A which samples v from the distribution D on ∑m           i=1 {−vi , vi }, and returns (`(n)/2)(v − q) is a symmetric
body coloring algorithm with `A (n) = `(n). Then the corollary follows from Theorem 6.5.

    Theorem 1.8 now follows directly from Corollary 6.6 and Theorem 1.4.


7    Body-centric algorithm for asymmetric convex bodies
In this section, we give the algorithmic implementation of the extended recentering procedure, which
returns full colorings matching the guarantees of Theorem 1.8. Interestingly, the coloring output by the
procedure will be essentially deterministic. The only randomness will be in effect due to the random
errors incurred in estimating barycenters.
    For a convex body K ⊆ Rn , unit vector θ ∈ Rn ,kθ k2 = 1, and v ∈ R, we define the shifted slice
Kv = (K − vθ ) ∩ {x ∈ Rn : hθ , xi = 0}. The main technical estimate we will require in this section, is the
  θ

following lower bound on the gaussian measure of shifted slices. We defer the proof of this estimate to
Section 8.

Theorem 7.1. There exists universal constants v0 , η0 , c0 > 0, such that for any n ≥ 1, convex body
K ⊆ Rn satisfying kb(K)k2 = η ≤ η0 and γn (K) = α ≥ 3/5, v ∈ [−v0 , v0 ] and θ ∈ Rn , kθ k2 = 1, we
have that
                                                              − 12
                                                                    !
                                                            e  100v
                             γn−1 (Kvθ ) ≥ (α − c0 η) 1 − √           .
                                                             4 2π


                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                     35
           DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

    The above inequality says that if barycenter of K is close to the origin, then the Gaussian measure of
parallel slices of K does not fall off too quickly as we move away from the origin.
    Recall that the problem can be recast as finding a vertex of a parallelepiped P contained inside the
convex body K, where the parallelepiped P = ∑ni=1 [−vi , vi ] − t and t ∈ ∑ni=1 [−vi , vi ]. Thus, 0 ∈ P.
    We start out by calling the recentering procedure to get the barycenter, b, close to the origin. This
recentering allows us to rescale K by a constant factor such that the Gaussian volume of K increases i. e.,
                                                      √             √
we replace P by β P and K by β K where β = 1 + π 8 log 2 + 4π log 2 is chosen such that the volume
of K after rescaling is at least 3/4. Then we find a point q∗ on ∂ P, the boundary of P which is closest to
the origin. We recurse by taking a (n − 1)-dimensional slice of K(here we abuse notation by calling the
convex body after rescaling as also K) with the facet containing q∗ . A crucial point here is that we choose
q∗ as the origin of the (n − 1)-dimensional space we use in the induction step. This is done to maintain
the induction hypothesis that the parallelepiped contains the origin. Theorem 7.1 guarantees that in doing
so, we do not lose too much Gaussian volume.

Algorithm 4 Body-centric algorithm for general convex bodies
 1: Input: Convex body K ⊆ Rn , given by a membership oracle, with γn (K) ≥ 1/2, an n-dimensional
      parallelepiped P 3 0 of side lengths at most 2αn .
 2: Output: A vertex of P contained in K.
                                                                                        1
 3:  Call Recentering Procedure on K and P with parameters δ = ηn and ε = 2(n+1)          .
           Restart from line 3 if the call outputs FAIL, and otherwise let q denote the output.
 4: Set q̄ = 0, W̄ = WP (q), K̄ = β ((K − q) ∩ W̄ ), P̄ = β (FP (q) − q).
 5: repeat
                                                                                    1
 6:      Call Recentering Procedure on P̄ and K̄ with δ = ηn and ε = 2(n+1)           .
           Restart from line 3 if the call outputs FAIL, and otherwise let s denote the output.
 7:      Set q̄ = q̄ + s, W̄ = WP̄ (s), K̄ = (K̄ − s) ∩ W̄ , P̄ = (FP̄ (s) − s).
 8:      if dim(W̄ ) 6= 0 then
 9:           Compute s ∈ argmin {kpk2 : p ∈ ∂ P̄ relative to W̄ }.
10:           Set q̄ = q̄ + s, W̄ = WP̄ (s), K̄ = (K̄ − s) ∩ W̄ , P̄ = (FP̄ (s) − s).
11:      end if
12: until dim(W̄ ) = 0
13: If q + q̄/β ∈ K and is a vertex of P, return q + q̄/β . Else, restart from line 3.



Lemma 7.2. Given a convex body K in Rn such that
                                                                         1
                              γn (K) ≥ 1/2      and        kb(K)k2 ≤     √ ,
                                                                       32 2π
                                    √            √
then γn (β K) ≥ 3/4, where β = 1 + π 8 log 2 + 4π log 2.
                                                                                     √
Proof. Let X be the standard n-dimensional Gaussian. From Lemma 9.9, E[kXkK ] ≤ 1 + π 8 log 2. This
gives
                      Pr[kXkK > β ] = Pr[kXkK − E[kXkK ] > β − E[kXkK ]]
                                                                    p
                                    ≤ Pr[kXkK − E[kXkK ] > β − 1 − π 8 log 2].

                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                            36
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

    By Lemma 9.6 and Lemma 9.7, the function k.kK is 4−Lipschitz. Then, by Theorem 9.8
                                                                        √
                                                          − 2 ( β −1−π4 8 log 2 )2
                                     Pr[kXkK > β ] ≤ e π 2                           = 1/4 .

Thus, γn (β K) = Pr[X ∈ β K] = 1 − Pr[kXkK > β ] ≥ 1 − 1/4 = 3/4, as needed.

    For Algorithm 4, we use the parameters
                          (                )                                                      
                                     1                                                  1     1
                 αn = min v0 , p             ,                ηn = min η0 ,             √ ,          ,
                                10 log(2n)                                            32 2π 14c0 n

where v0 , η0 , c0 are as in Theorem 7.1. We now give the formal analysis of the algorithm.
   We begin by explaining how to compute the minimum norm point on the boundary of a parallelepiped.
Lemma 7.3. Let P = ∑ki=1 [−vi , vi ] − t ⊂ Rn be a parallelepiped of side lengths at most 2α containing
the origin, with v1 , . . . , vk linearly independent. Let v∗1 , . . . , v∗k denote the dual basis, i. e., the unique set of
vectors lying inside W := span(v1 , . . . , vk ) satisfying hv∗i , v j i = 1 if i = j and 0 otherwise, and let s denote
an element of minimum `2 norm in the set
                                              ±1 + hv∗i ,ti v∗i
                                                                                 
                                                           ·            : i  ∈ [k]  .
                                                kv∗i k2      kv∗i k2
Then, the following hold.
   1. s ∈ argmin {kpk : p ∈ ∂ P relative to W }.
   2. WP (s) ⊆ {x ∈ W : hs, xi = 0}.
   3. ksk2 ≤ α.
Furthermore, s can be computed in polynomial time.
Proof. Note that for any x ∈ W , we have that x = ∑ki=1 hv∗i , xivi . From here, given that P = ∑ki=1 [−vi , vi ]−t,
it is easy to check that

                           P = {x ∈ W : −1 + hv∗i ,ti ≤ hv∗i , xi ≤ 1 + hv∗i ,ti, ∀i ∈ [k]} .                         (7.1)

    We now show that s ∈ argmin {kpk2 : p ∈ ∂ P relative to W }. Since 0 ∈ P, we must show that x ∈ P
if kxk2 ≤ ksk2 and x ∈ W , and that s ∈ ∂ P relative to W . Given that the vectors v∗i /kv∗i k2 have unit `2
norm, the norm of s is equal to
                                              | ± 1 + hv∗i ,ti|
                                                                        
                                 ω := min                       : i ∈ [k] .
                                                   kv∗i k2
Now assume x ∈ W and kxk2 ≤ ω. Since by assumption 0 ∈ P, we must have −1 +hv∗i ,ti ≤ 0 ≤ 1 +hv∗i ,ti,
∀i ∈ [k]. Therefore, for i ∈ [k], by Cauchy-Schwarz

                                        hv∗i , xi ≤ kv∗i k2 · ω ≤ 1 + hv∗i ,ti,
                                        hv∗i , xi ≥ −kv∗i k2 · ω ≥ −1 + hv∗i ,ti .


                          T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                          37
          DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

Hence x ∈ P, as needed. Next, we must show that s ∈ ∂ P relative to W . Firstly, clearly s ∈ W since each
v∗1 , . . . , v∗k ∈ W , and thus by the above argument s ∈ P. Now choose i ∈ [k], r ∈ {−1, 1} such that

                                                  r + hv∗i ,ti v∗i
                                             s=               · ∗ .
                                                    kv∗i k2    kvi k2

Then, by a direct calculation hv∗i , si = r + hv∗i ,ti, and hence s satisfies one of the inequalities of P (see
Equation (7.1)) at equality. Thus, s ∈ ∂ P relative to W (note that P is full-dimensional in W ), as needed.
    We now show that WP (s) ⊆ {x ∈ W : hs, xi = 0}. By the above paragraph, every element x of the
minimal face FP (s) of P containing s satisfies hv∗i , xi = r + hv∗i ,ti = hv∗i , si. In particular hv∗i , x − si = 0.
Since s is collinear with v∗i (s may be 0), we have hv∗i , x − si = 0 ⇒ hs, x − si = 0. The claim now follows
since WP (s) is the span of FP (s) − s and FP (s) − s ⊆ {x ∈ W : hs, xi = 0} by the previous statement.
    We now show that ksk2 ≤ α. Firstly, by minimality of s, note that |r + hv∗i ,ti| ≤ 1, for r and i as
above. Thus, by Cauchy-Schwarz,

                                 r + hv∗i ,ti    1       hvi , v∗i i kvi k2 kv∗i k2
                        ksk =                 ≤        =            ≤               = kvi k .
                                   kv∗i k2      kv∗i k    kv∗i k2       kv∗i k2

Since P has side lengths at most 2α, we have kvi k ≤ α. Thus, ksk2 ≤ α, as claimed.
       We now prove the furthermore. Let V denote the matrix whose columns are v1 , . . . , vk . By linear
independence of v1 , . . . , vk , the matrix V TV is invertible. Since then (V (V TV )−1 )TV = Ik , we see that
v∗1 , . . . , v∗k are the columns of V (V TV )−1 (note that these lie in W by construction), and hence can be
constructed in polynomial time. Since s can clearly be constructed in polynomial time from the dual basis
and t, the claim is proven.

Theorem 7.4. Algorithm 4 is correct and runs in expected polynomial time.

Proof. Clearly, by the check on line 13, correctness is trivial. So we need only show that the algorithm
terminates in expected polynomial time. In particular, it suffices to show the probability that a run of the
algorithm terminates without a restart is at least 1/2.
    For this purpose, we will show that the algorithm terminates correctly conditioned on the event that
each call to the recentering procedure terminates correctly, which we denote E. Later, we will show that
this event occurs with probability at least 1/2, which will finish the proof.
    Let W1 , K1 , P1 denote the values W̄ , K̄, P̄ after line 4.
    During the iterations of the repeat loop, it is easy to check by induction that after the execution of
either line 7 or 10, the variables q̄, W̄ , K̄, P̄ satisfy:

                    q̄ ∈ P1 ,   W̄ = WP1 (q̄),   K̄ = (K1 − q̄) ∩WP1 (q̄),     P̄ = FP1 (q̄) − q .              (7.2)

   We now establish the main invariant of the loop, which will be crucial in establishing correctness
conditioned on E:
Claim 7.5. Let W̄ , K̄, P̄ denote the state after k ≥ 0 successful iterations of the repeat loop. Then, the
following hold:

   1. dimW ≤ n − k.

                         T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                    38
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

   2. Conditioned on E, γW̄ (K̄) ≥ 3/4 − 7n
                                          k
                                            > 3/5.


Proof. We prove the claim by induction on k.
     For k = 0, the state corresponds to W1 , K1 and P1 . Trivially, dim(W1 ) ≤ n = n − k, so the first condition
holds. Conditioned on E, we have that K1 /β has Gaussian   √         mass at least 1/2 restricted to W1 and its
barycenter has `2 norm at most ηn . Since ηn ≤ 1/(32 2π) by Lemma 7.2, we have that γW1 (K1 ) ≥ 3/4.
Thus, the second condition holds as well.
     We now assume the statement holds after k iterations, and show it holds after iteration k + 1, assuming
that we don’t terminate after iteration k and that we successfully complete iteration k + 1. Here, we
denote the state at the beginning of iteration k + 1 by W̄ , K̄, P̄, after line 7 by W̄1 , K̄1 , P̄1 and at the end the
iteration by W̄2 , K̄2 , P̄2 .
     We first verify that W̄2 ≤ n − (k + 1). By the induction hypothesis n − k ≥ dim(W̄ ) and by construction
W̄2 ⊆ W̄1 ⊆ W̄ . Thus, we need only show that dim(W̄2 ) < dim(W̄ ). Given that we successfully complete
the iteration, namely the call to the recentering algorithm on line 6 doesn’t return FAIL, we may distinguish
two cases. Firstly, if dim(W̄2 ) = 0, then we must have dim(W̄2 ) < dim(W̄ ), since otherwise dim(W̄ ) = 0
and the loop would have exited after the previous iteration. Second if dim(W̄2 ) > 0, we must have entered
the if statement on line 8 since dim(W̄2 ) ≤ dim(W̄1 ). From here, we see that dim(W̄2 ) corresponds to the
dimension of the minimal face of P̄1 containing s. Since s is on the boundary of P̄1 relative to W̄1 , we
get that dim(W̄2 ) < dim(W̄1 ) ≤ dim(W̄ ), as needed. Thus, condition 1 holds at the end of the iteration as
claimed.
     We now show that conditioned E, γW̄2 (K̄2 ) ≥ 3/4 − (k + 1)/(7n). By the induction hypothesis,
recall that γW̄ (K̄) ≥ 3/4 − k/(7n), thus it suffices to prove that γW̄2 (K̄2 ) ≥ γW̄ (K̄) − 1/(7n). Note that
since we decrease dimension at every iteration (as argued in the previous paragraph), the number of
iterations of the loop can never exceed n. Thus, after any valid number of iterations l, we always have
3/4 − l/(7n) ≥ 3/4 − 1/7 > 3/5. In particular, we have γW̄ (K̄) ≥ 3/5.
     We now track the change in Gaussian mass going from K̄ to K̄2 . Since the recentering procedure on
line 6 terminates correctly by our conditioning on E, we get that

                                  γW̄1 (K̄1 ) ≥ γW̄ (K̄)   and   kb(K̄1 )k2 ≤ ηn .

If dim(W̄1 ) = 0, then clearly W̄2 = W̄1 and K̄2 = K̄1 , and hence γW̄2 (K̄2 ) ≥ γW̄ (K̄) as needed. If dim(W̄1 ) >
0, we enter the if statement at line 8. Since P̄1 is a parallelepiped containing 0 of side length 2αn , by
Lemma 7.3 we have that ksk ≤ αn and W̄2 = WP̄1 (s) ⊆ Hs where Hs := {x ∈ W̄1 : hs, xi = 0}. Now if
s = 0, then K̄2 = K̄1 ∩ W̄2 , and thus by Lemma 2.6, we have γW̄2 (K̄2 ) ≥ γW̄1 (K̄1 ) ≥ γW̄ (K̄), as needed. If
s 6= 0, given that ksk ≤ αn ≤ v0 , kb(K̄1 )k2 ≤ ηn ≤ η0 and γW̄1 (K̄1 ) ≥ 3/5, by applying Theorem 7.1 on
K̄1 with v = ksk2 and θ = s/v, we get that
                                                                                  
                                                                            − 1 2
                                                                          e  100αn
                       γHs ((K̄1 − s) ∩ Hs ) ≥ (γW̄1 (K̄1 ) − c0 ηn ) 1 − √ 
                                                                          4 2π
                                                                                                              (7.3)
                                                                        1
                                                                    −
                                                                  e 100αn2             1
                                              ≥ γW̄ (K̄) − c0 ηn − √       ≥ γW̄ (K̄) − .
                                                                  4 2π                 7n

                         T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                     39
         DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

Since K̄2 = (K̄1 − s) ∩ W̄2 , W̄2 ⊆ Hs and γHs ((K̄1 − s) ∩ Hs ) ≥ 3/4 − (k + 1)/(7n) > 1/2, by Lemma 2.6

                                      γW̄2 (K̄2 ) ≥ γHs ((K̄1 − s) ∩ Hs ).                            (7.4)

The desired estimate follows combining (7.3) and (7.4).

    By Claim 7.5, we see that the number of iterations of the repeat loop is always bounded by n.
Furthermore, conditioned on E, the loop successfully terminates with W̄ ,K̄,P̄ satisfying γW̄ (K̄) > 0 and
dim(W̄ ) = 0. Since dim(W̄ ) = 0, this implies that W̄ = K̄ = {0}. Furthermore, by Equation (7.2), this
implies that q̄ ∈ K1 ∩ P1 and dim(WP1 (q̄)) = 0, and hence q̄ is a vertex of P1 . Since K1 = β (K − q) and
P1 = β (P − q), we get that q + q̄/β is a vertex of P contained in K, as needed. Thus, conditioned on E,
the algorithm returns correctly.
    To lower bound E, by the above analysis, note that we never call the recentering procedure more than
n + 1 times, i. e., once on line 3 and at most n times on line 6. By the union bound, the probability that
one of these calls fails is at most (n + 1) · 1/(2(n + 1)) = 1/2. Thus, E occurs with probability at least
1/2, as needed.


8    An estimate on the Gaussian measure of slices
In this section, we prove Theorem 7.1. We will need the following estimate on Gaussian tails [1,
Formula 7.1.13].

Lemma 8.1 (Gaussian tail bounds). Let X ∼ N(0, 1). Then for any t ≥ 0,
                          r          2                          r               2
                              2 e−t /2                               2    e−t /2
                                   √       ≤ Pr[X ≥ t] ≤                  p          .
                              π t + t2 + 4                           π t + t 2 + 8/π

    Before proving Theorem 7.1, we first prove a similar result for a special class of convex bodies in R2 .
    We define a convex body K in R2 to be downwards closed if (x, y) ∈ K implies (x, y0 ) ∈ K for all
y0 ≤ y. For notational convenience, we shall denote the first and second coordinate of a vector in R2
respectively as the x and y coordinates. We shall say the slice of K at x = t or y = t to denote either the
vertical slice of K having x-coordinate t or horizontal slice having y-coordinate t. We define the height of
K at x = t to be maximum y-coordinate of any point (t, y) ∈ K. By convention, we let the height of K at
x = t be −∞ if K does not a contain a point with x-coordinate t.

Lemma 8.2. Let K ⊆ R2 be a downwards closed convex body with γ2 (K) = α ≥ 1/2 and barycenter
b = b(K) satisfying b1 ≥ 0, and let g = Φ−1 (α) ≥ 0. Then, there exists a universal constant v0 > 0 such
that for all 0 ≤ v ≤ v0 , the height of K at x = v is at least

                                                                               v2
                                                                                   
                                                        g2     1
                                f (v, g) := g − min e    2 − 8ev2   , (4e + 2)          .
                                                                               g

Proof.

                      T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                              40
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

Step 1: Reduction to a wedge. We first show that the worst-case bodies for the lemma are “wedge-
shaped” (see the illustration in Figure 1). Namely, the worst case down closed convex bodies are of
form
       (x, y) ∈ R2 : x ≥ −c, sx + ty ≤ d    where d, s,t ≥ 0, s2 + t 2 = 1, c ∈ R≥0 ∪ {∞} .
      
                                                                                              (8.1)
More precisely, we will show that given any body K satisfying the conditions of the theorem, there exists
a wedge W satisfying the conditions of the theorem whose height at x = v is at most that of K.
     Let K ⊆ R2 satisfy the conditions of the theorem. We first show that K contains a point on the line
at x = v. If not, we claim that K has Gaussian mass at most γ1 (−v, v) ≤ γ1 (−v0 , v0 ) < 1/2 by choosing
v0 small enough, a clear contradiction. To see this, note that by pushing the mass of K to the right as
much as possible towards the line x = v, we can replace K by a band [a, v] × R with the same Gaussian
mass, and barycenter to the right of b(K). Clearly, such a band has barycenter to the right of the y-axis iff
a ≥ −v, and hence K has Gaussian mass at most γ2 ([−v, v] × R) = γ1 (−v, v), as needed.
     Now assume that K has height at least g at x = v, where we recall that g = Φ−1 (α). Note now that
the band W = R × (−∞, g] (corresponding to s = 0,t = 1, d = g, c = ∞) has height at x = v at most that
of K and satisfies the conditions of the theorem. Thus, we may assume that the height of K at x = v is f ,
where −∞ < f < g. Note that (v, f ) is now a point on the boundary of K.
     Let g0 denote the height of K at x = 0. Since γ2 (K) ≥ 1/2, by Lemma 2.6 we have that γ1 (∞, g0 ) ≥
γ2 (K), and hence g0 ≥ g ≥ 0. Thus g0 ≥ g > f , and hence v > 0 (since otherwise we would have f = g0 ).
By convexity of K, we may choose a line ` tangent          to K passing through (v, f ). We may now choose
t ≥ 0, s, d ∈ R, such that s2 + t 2 = 1 and ` = (x, y) ∈ R2 : sx + ty = d . Since K is downwards-closed,
                                                   

t ≥ 0 and ` is tangent to K, we must have that K ⊆ H` := {(x, y) : sx + ty ≤ d}. Since 0 is below (0, g0 ) ∈ K,
we have that 0 ∈ H` , and hence d ≥ 0. Given the (0, g0 ) ⊆ H` , we have that tg0 ≤ d, and, because ` is
tangent at (v, f ), also sv + t f = d; using that v > 0and g0 > f , we conclude that s > 0.
     We will now show that the wedge W = H` ∩ (x, y) ∈ R2 : x ≥ −c satisfies our requirements for
an appropriate     choice of c (note the conditions   for s,t, d are already satisfied by the above paragraph).
Let B−                   2 : x ≤ v and B+ = (x, y) ∈ R2 : x ≥ v . Choose c ≥ −v such that γ (W ∩ B− ) =
                                              
      v  =    (x, y) ∈ R                  v                                                        2      v
γ2 (K ∩ B− v ). Note   that such  a c must  exist since K  ⊆   H ` . Now  by construction,  note that W has the
same height as K at x = v, so it remains to check that c ≥ 0, γ2 (W ) ≥ 1/2 and b(W )1 ≥ 0. To bound the
Gaussian mass, again by construction, we have that
                     γ2 (W ) = γ2 (W ∩ B−              +              −              +
                                        v ) + γ2 (W ∩ Bv ) = γ2 (K ∩ Bv ) + γ2 (W ∩ Bv )
                            ≥ γ2 (K ∩ B−              +
                                       v ) + γ2 (K ∩ Bv ) = γ2 (K) ≥ 1/2 .

Given that γ2 (W ) ≥ 1/2, we must have that 0 ∈ W and hence c ≥ 0, as needed. It remains to check that
b(W )1 ≥ 0. For this purpose, note first that we can transform K ∩ B−           −
                                                                    v into W ∩ Bv by only pushing mass
to the right, and hence
                             Z                                   Z
                                          −(x+y)2 /2                              2
                                     xe                dy dx ≤           xe−(x+y) /2 dy dx .             (8.2)
                              K∩B−
                                 v                               W ∩B−
                                                                     v

Since v ≥ 0 and K ∩ B+        +
                     v ⊆ W ∩ Bv , we also immediately get that
                             Z                                   Z
                                                2                                 2

                                  +
                                    xe−(x+y) /2 dy dx ≤                  xe−(x+y) /2 dy dx .             (8.3)
                              K∩Bv                               W ∩B+
                                                                     v



                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                41
          DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV




                         Figure 1: B is the triangular region beneath the red line


We derive b(W )1 ≥ 0 by combining (8.2), (8.3) and our assumption that b(K)1 ≥ 0.
 Given the     above reduction, we may now assume that K is a wedge of the form
  (x, y) ∈ R2 : x ≥ −c, sx + ty ≤ d , d, s,t ≥ 0, s2 + t 2 = 1, as in Equation (8.1). We first take care of
some trivial cases. Firstly, if s = 0,t = 1, we have K = [−c, ∞) × (−∞, d]. Then the height at x = v is
clearly d, and since γ1 (−∞, d) ≥ γ2 (K), we get d ≥ g as is needed. Now assume that s = 1,t = 0, then
K = [−c, d] × R, and hence the height at x = v is infinite (note that K always intersects the line at x = v
by the first part), and thus the desired bound trivially holds. We may thus assume that both s,t > 0.
    In this setting, the line ` = (x, y) ∈ R2 : sx + ty = d intersects the x-axis at a = d/s and forms an
                                  

angle θ ∈ (0, π/2) with the x-axis as in Figure 1. Given the normalization s2 + t 2 = 1, note that d is the
perpendicular distance from 0 to the edge K ∩ ` of K. In what follows, we maintain the parametrization
of the wedge K in terms of a,θ ,c, using the relations d = a sin θ , s = sin θ and t = cos θ to recover d, s,t
when needed.
    Recall that γ2 (K) ≥ α = γ1 (−∞, g) and b(K)1 ≥ 0. Let f ∗ = f (v, g) and let f be the height of K at
x = v. We want to prove that f ≥ f ∗ . If f ≥ g, we are already done since f ∗ ≤ g. Note that by Lemma 2.6,
f ≥ g if v = 0 and hence we may assume v > 0. Our goal is now to show that

                                                                           v2
                                                                               
                                                    g2     1
                                    g − f ≤ min e    2 − 8ev2   , (4e + 2)          .
                                                                           g


Step 2: Using the barycenter condition. We now derive a bound on how large d must be, given c and
θ , such that the x-coordinate of barycenter is non-negative.
     Let H denote the halfspace (x, y) ∈ R2 : sx + ty ≤ d and let E = H \ K. A simple computation
                                   

shows
                                                                         2
                                    1                     e−d /2
                                Z
                                           2  2
                                  x e−1/2(x +y ) dy dx = − √     sin θ .
                                 H 2π                       2π

                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                               42
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

Now from the fact that b(K)1 ≥ 0, and E = H \ K, we get
                  2                                                                         2
             e−d /2                  1                              1              e−c /2
                              Z                              Z
                                            2  2                            2
            − √     sin θ ≥        x e−1/2(x +y ) dy dx ≥          √ xe−1/2x dx = − √ .                 (8.4)
               2π                 E 2π                        x≤−c  2π               2π
                              1   2     2
   It follows that sin θ ≤ e− 2 (c −d ) . When c2 ≥ −2 log sin θ , we get the following bound on d:
                                               p
                                           d ≥ c2 + 2 log sin θ .

   When c2 < −2 log sin θ , this gives no useful bound on d since in that case the barycenter is non-
negative even for d = 0. But d ≥ 0 always as the Gaussian measure of K is at least half. Thus,
                                       q
                                    d ≥ max{0, c2 + 2 log sin θ } .                             (8.5)

Step 3: Getting a bound on f . By construction of K, the point (v, f ) lies on the boundary of H, and
hence
                              vs + f t = v sin θ + f cos θ = a sin θ = d .                      (8.6)
   Now,
                                  1 2
                       1     e− 2 c  1
                      √       √     ≤ γ1 (c, ∞) ≤ γ2 (E)            (using Lemma 8.1) .
                                  2
                       2π c + c + 4  2
Also,
                                                                  1          1 2
γ2 (E) = γ2 (H) − γ2 (K) = γ1 (−∞, d) − γ1 (−∞, g) = γ1 (g, d) ≤ √ (d − g)e− 2 g          (since d ≥ g ≥ 0) .
                                                                  2π
   Combining the above two, we get
                                              1   2   2
                                              e 2 (g −c )
                                                 √        ≤ d −g.                                       (8.7)
                                            c + c2 + 4
   From (8.5) and (8.7),
                                (q                                  1 2    2
                                                                              )
                                          2
                                                                  e 2 (g −c )
                        d ≥ max   max{0, c + 2 log sin θ }, g +      √          .
                                                                c + c2 + 4

   Putting the above in (8.6),
                                                                    1 (g2 −c2 )
                                                                                     
                         pmax{0, c2 + 2 log sin θ } − v sin θ g + e 2 √ 2 − v sin θ 
                                                                                    
                                                                   c+ c +4
               f ≥ max                                        ,                        .
                        
                                     cos θ                              cos θ       
                                                                                     

   Observe that
                             γ1 (−∞, g) = γ2 (K) ≤ γ1 (−c, ∞) = γ1 (−∞, c),

                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                              43
         DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

giving c ≥ g. Also θ ∈ [0, π/2]. Thus, the above lower bound on f holds if we minimize over all c ≥ g
and θ ∈ [0, π/2].
                                                                             1 (g2 −c2 )
                                                                                              
                                  pmax{0, c2 + 2 log sin θ } − v sin θ g + e 2 √ 2 − v sin θ 
                                                                                             
                                                                            c+ c +4
          f≥      min        max                                       ,                        .
              c≥g,θ ∈[0,π/2]     
                                              cos θ                              cos θ       
                                                                                              

    We will first minimize with respect to c. For this, we make the following observations:
   • For a fixed θ , the first term inside the maximum is a non-decreasing function of c while the second
     is a decreasing function of c.
              p
   • For c = g2 − 2 log sin θ ≥ g, the first term is smaller than the second term.
              p
   • For c = g2 − 2 log sin θ + 1 ≥ g, the first term is greater than the second term.
Thus, the two terms must become equal somewhere in the range
                                    p                  p
                               c ∈ [ g2 − 2 log sin θ , g2 − 2 log sin θ + 1] .
                                p
In particular, substituting c = g2 − 2 log sin θ + 1 in the second term provides a lower bound for f :
                                                                       sin θ √
                                            g+ √ √ 2                                           − v sin θ
                                                  e(    g −2 log sin θ +1+ g2 −2 log sin θ +5)
                      f≥         min
                               θ ∈[0,π/2]                               cos θ
                                                                                                            (8.8)
                                            g + √ √ 2 sin θ          − v sin θ
                                               2 e g −2 log sin θ +5
                           ≥     min                                                .
                               θ ∈[0,π/2]                   cos θ

    This expression goes to g as θ → 0 and to ∞ as θ → π/2. If it is increasing in this whole interval, we
are already done. Else, it achieves its minimum somewhere in (0, π/2). Let this be at θ ∗ . Setting the
derivative to zero, we get
                                                       sin 2θ ∗
                            f ≥ g cos θ ∗ − √ p                        3
                                            4 e g2 − 2 log sin θ ∗ + 5
                                                                                                            (8.9)
                                                                sin 2θ ∗
                              = g − 2g sin2 (θ ∗ /2) − √ p                       3
                                                                                   ,
                                                      4 e g2 − 2 log sin θ ∗ + 5
    where θ ∗ satisfies
                                                             1
                                     v = g sin θ ∗ + √ p                      .
                                                    2 e g − 2 log sin θ ∗ + 5
                                                         2

    From the two terms above, we can get two upper bounds on sin θ ∗ :

                                                       sin θ ∗ ≤ v/g ,
                                                                    2
                                                            ∗    eg /2+5/2                                 (8.10)
                                                       sin θ ≤            1     .
                                                                    e   8ev2



                          T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                               44
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

    Using these, we can simplify (8.9) as
                              f ≥ g − 2g sin2 (θ ∗ /2) − 2e sin 2θ ∗ (v − g sin θ ∗ )3
                               ≥ g − 2g sin2 (θ ∗ /2) − 2ev sin 2θ ∗                                     (8.11)
                                             2    ∗              ∗
                               ≥ g − 2g sin (θ ) − 4ev sin θ .
   We derive two bounds on the above expression, one which will be useful when g is small and other
when g is large. For the small g bound, using that v < v0 for v0 small enough,
                                                                        2                2
                              2    ∗              ∗  v       eg /2+5/2 eg /2
                        2g sin (θ ) + 4ev sin θ ≤ (2g + 4ev)       1  ≤ 1 .
                                                     g         e 8ev2  e 8ev2
    For the large g bound,
                                                           v2 4ev2           v2
                          2g sin2 (θ ∗ ) + 4ev sin θ ∗ ≤ 2g 2 +    = (4e + 2) .
                                                           g    g            g
    Thus,
                                                           v2
                                       2                     
                                        g
                                          − 1
                           f ≥ g − min e 2 8ev2 , (4e + 2)      = f ∗ , as needed.
                                                           g
    We now prove Theorem 7.1 in the special case where the barycenter lies to the right of the hyperplane
θ ⊥ . We show later how to reduce Theorem 7.1 to this case.
Lemma 8.3. There exists universal constants v0 , c0 > 0, such that for any n ≥ 1, v ∈ [0, v0 ] and θ ∈ Rn ,
kθ k2 = 1, convex body K ⊆ Rn satisfying γn (K) = α ≥ 1/2 and hb(K), θ i ≥ 0, we have that
                                                          − 12
                                                                !
                                                         e 100v
                                   γn−1 (Kvθ ) ≥ α 1 − √          .
                                                         4 2π
Proof. We split the proof into two steps. In step one, we reduce to a 2-dimensional problem and show
that it suffices to prove our theorem for a downwards closed convex body K 0 ⊆ R2 . This reduction
will guarantee that K 0 has barycenter on the y-axis and that the Gaussian measure of slices of K 0
parallel to the y-axis will correspond in the natural way to that of slices of K parallel to the hyperplane
θ ⊥ = {x ∈ Rn : hx, θ i = 0}. We then invoke Lemma 8.2 to get a lower bound on the height of K 0 at x = v.
Lastly, in step 2, we show that implies the required lower bound on the slice measure.
    Let g be s.t. γ1 (−∞, g) = α, i. e., g = Φ−1 (α). Note that g ≥ 0 since α ≥ 1/2.

Step 1: reduction to a 2-dimensional case. We will reduce our problem to one for a 2-dimensional
downwards closed convex body K 0 . To specify K 0 , we need only specify the height of the boundary at each
x-coordinate. At x-coordinate t, we define the height of K 0 to be the yt satisfying γ1 (−∞, yt ) = γn−1 (Ktθ ).
From Ehrhard’s inequality, we see that K 0 is in fact convex. Furthermore, it is easy to check that
γ2 (K 0 ) = γn (K) and b(K 0 )1 = hb(K), θ i ≥ 0.
     Thus, K 0 is a downwards closed convex body in R2 with γ2 (K 0 ) = α, b(K 0 )1 ≥ 0. From here, we may
invoke Lemma 8.2 to conclude that the height of K 0 at x = v is at least f ∗ := f (v, g). We now have that
                                   γn−1 (Ktθ ) = γ1 (−∞, yt ) ≥ γ1 (−∞, f ∗ ) .
From the above, it suffices to give a lower bound on γ1 (−∞, f ∗ ) in order to derive the theorem.

                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                 45
           DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

Step 2: Bounding γ1 (−∞, f ∗ ). Our goal is to show that
                                                                        1   !
                                                                    −
                                                          e 100v2
                                     γ1 (−∞, f ∗ ) ≥ α 1 − √                    .
                                                          4 2π

Clearly, it suffices to show
                                                                    1
                                                              −
                                                ∗        e 100v2
                                          γ1 ( f , g) ≤ α √ .
                                                         4 2π
Let
                                                                               v2
                                                                                       
                                                        g2     1
                                          ∗
                               εg = g − f = min e        2 − 8ev2   , (4e + 2)              .
                                                                               g
We split the analysis in two cases depending on whether g is small or big.

             1
Step 2a: g ≤ 5v .

                                           g2   1         1         1               1                   1
                                          −            −       −         −
                         ∗         εg  e 2 8ev2  e 50v2 8ev2  e 100v2   e 100v2
                    γ1 ( f , g) ≤ √ ≤ √         ≤ √          ≤ √      ≤α √ .
                                    2π    2π           2π     8 2π      4 2π

The penultimate inequality holds for an appropriate choice of v0 , and the last inequality uses α ≥ 1/2.

             1
Step 2b: g > 5v . Here we will use the other bound for εg .

                                       εg     ∗ 2     εg    2        2
                      γ1 ( f ∗ , g) ≤ √ e−( f ) /2 = √ e−g /2 egεg −εg /2
                                        2π             2π
                                       εg −g2 /2+gεg (4e + 2)v2 −g2 /2+(4e+2)v2
                                    ≤√ e            ≤     √    e                                            (8.12)
                                        2π               g 2π
                                                                                1                   1
                                                                            −                   −
                                 5(4e + 2)v3 − 1 2 +(4e+2)v2 e 100v2   e 100v2
                               ≤    √       e 50v           ≤ √      ≤α √ .
                                      2π                     8 2π      4 2π

The penultimate inequality holds for an appropriate choice of v0 , and the last inequality uses α ≥ 1/2.

      We now come to the proof of Theorem 7.1.
Theorem 7.1 (restated): There exist universal constants v0 , η0 , c0 > 0, such that for any n ≥ 1, convex
body K ⊆ Rn satisfying kb(K)k2 = η ≤ η0 and γn (K) = α ≥ 3/5, v ∈ [−v0 , v0 ] and θ ∈ Rn , kθ k2 = 1,
we have that
                                                               − 12
                                                                      !
                                                             e   100v
                              γn−1 (Kvθ ) ≥ (α − c0 η) 1 − √            .
                                                             4 2π



                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                   46
       T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

Proof. By rotational invariance, we may assume that θ = e1 , the first standard basis vector. By possibly
replacing K by −K, we may also assume that v ≥ 0.
     If b(K)1 ≥ 0, the desired lower bound follows directly from Lemma 8.3. Given this, we may assume
that −η ≤ b(K)1 < 0. To deal with this second case, the main idea is to remove some portion of K lying
to the left of the hyperplane e⊥          n
                               1 = {x ∈ R : x1 = 0} so that the barycenter of the remaining body lies on
e⊥
 1 . After this, we apply Lemma 8.3 again on the truncated body.
     Define                                                  1    2
                                                          e− 2 kxk2
                                                    Z
                                 b := b(K)1 γn (K) = x1 √ n dx < 0 .
                                                      K      2π
Let Ht− = {x ∈ Rn : x1 ≤ t} and Ht+ = {x ∈ Rn : x1 ≥ t} for t ∈ R. Let z < 0 be defined as the smallest
(in absolute value) negative number satisfying
                                                                 1       2
                                                            e− 2 kxk2
                                               Z
                                                         x1  √ n dx = 0 .                                    (8.13)
                                                   K∩Hz+       2π
By continuity, such a z must exists, since as z → −∞ the left hand side tends to b < 0 and at z = 0 it is
positive. Given the above, note that b(K ∩ Hz+ )1 = 0. We will now show that γn (K ∩ Hz+ ) ≥ α − c0 η if
η ≤ η0 , for c0 , η0 appropriately chosen constants.
   By our choice of z, we have the equality
                                                                 1       2
                                                           e− 2 kxk2
                                               Z
                                                         x1 √     n dx = b .
                                                   K∩Hz−      2π
From here, we see that
                                                     1       2                       1       2
                                             e− 2 kxk2                    x1 e− 2 kxk2 b b
                                      Z                              Z
                    γn (K ∩ Hz− ) =           √ n dx ≤                  − z
                                                                              √ n dx = =   .
                                       K∩Hz−    2π                   K∩Hz       2π     z z
Given this, we get that
                                                                                                 b     η
             γn (K ∩ Hz+ ) = γn (K) − γn (K ∩ Hz− ) = α − γn (K ∩ Hz− ) ≥ α −                      ≥α−   .
                                                                                                 z     z
   We now show that there exists a constant c0 s.t. 1/|z| ≤ c0 . Let β = γn (K ∩ H0+ ), and note that

                          β = γn (K) − γn (K ∩ H0− ) ≥ α − 1/2 ≥ 3/5 − 1/2 = 1/10 .

Let τ > 0 be positive number satisfying γ1 (0, τ) = β , i. e., τ = Φ−1 (1/2 + β ). By pushing the mass of
K ∩ H0+ to the left towards e⊥
                             1 as much as possible, we see that
                               1   2                                         1   2
                            e− 2 kxk2                            e− 2 kxk2  1
                Z                              Z
                                                                                     2
                        x 1  √ n dx ≥                          x1 √ n dx = √ (1 − e−τ /2 ) .                 (8.14)
                  K∩H0+        2π              {x∈Rn :0≤x1 ≤τ}      2π      2π
Next, by inclusion
                                       1   2                                     1       2
                                e− 2 kxk2                               e− 2 kxk2  1
            Z                                            Z
                                                                                        2
                              x1 √ n dx ≥                             x1 √ n dx = √ (e−z /2 − 1) .           (8.15)
              K∩{x∈Rn :z≤x≤0}      2π                        n
                                                         {x∈R :z≤x≤0}      2π      2π

                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                   47
         DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

Given that z satisfies (8.13), combining Equations (8.14), (8.15), we must have that
                                      2        2
                            0 ≥ e−z /2 − e−τ /2 ⇒ |z| ≥ τ ≥ Φ−1 (6/10) > 0 .

Thus, we may set c0 = 1/Φ−1 (6/10). Set η0 = 1/(10c0 ). Since η ≤ η0 , we have that

                              γn (K ∩ Hz+ ) ≥ α − c0 η ≥ 3/5 − 1/10 = 1/2 .

Lastly, using Lemma 8.3 on K ∩ Hz+ , we now get that
                                                                                                    1   !
                                                                                              −
                                                                                      e 100v2
                      γn−1 (Kve1 ) = γn−1 ((K ∩ Hz+ )ev1 ) ≥ (α − c0 η)             1− √                    ,
                                                                                      4 2π

as needed.


9     Basic estimates
In this section we prove basic estimates that we omitted earlier.

9.1   Proof of Lemma 2.3
Proof. We first prove subgaussianity. Let θ ∈ Sn−1 , t ≥ 0. By assumption on X,
                                                                                                            2 2
                                                                                                         eσ λ /2
                Pr[|hX, θ i| ≥ t] = min Pr[cosh(λ hX, θ i) ≥ cosh(λt)] ≤ min β ·
                                      λ >0                                                   λ >0       cosh(λt)
                                               σ 2 λ 2 /2−λt              − 12 (t/σ )2
                                ≤ min 2β · e                   ≤ 2β · e                  ,
                                      λ >0

where the lastpinequality follows by setting λ = t/σ 2 .
     Let α = log2 β + 1. To prove that X is ασ -subgaussian, since probabilities are always at most one,
it suffices to prove that        n                     o
                                             1       2      1          2
                              min 1, 2β · e− 2 (t/σ ) ≤ 2e− 2 (t/(ασ )) , ∀t ≥ 0.
                 √
Replacing t ← 2σt, the above simplifies to showing
                                      n              2
                                                       o          2
                                 min 1, 2β · e−t ≤ 2e−(t/α) , ∀t ≥ 0.                              (9.1)

From here, we see that
                                                                     s
                      2           2                    2   2                         α2
                β · e−t ≤ e−(t/α) ⇔ β ≤ e(1−1/α )t ⇔ t ≥
                                                                                             p
                                                                           ln β ·          =  ln(2β ).             (9.2)
                                                                                    α2 − 1
                                           2           2
Let r = ln(2β ), noting that 1 = 2β · e−r = 2e−(r/α) , we have that for t ≤ r, the LHS of (9.1) is 1 and
       p
                                                              2
the RHS is at least 1, for t > r, the LHS is equal to 2β · e−t and the RHS is larger by (9.2). Thus, X is
ασ -subgaussian as needed.

                      T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                          48
         T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

    We now prove the furthermore. For X an n-dimensional standard Gaussian, note that hX, wi is
distributed like σY , where Y ∼ N(0, 1) and σ = kwk2 . Hence,
                                                   1    ∞ Z
                                                                   2
                           E[ehX,wi ] = E[eσY ] = √       eσ x e−x /2 dx
                                                   2π −∞
                                                                      
                                                 1
                                                    Z ∞
                                          2
                                         σ /2            −(x−σ )2 /2        2
                                      =e        √       e            dx = eσ /2 .
                                                 2π −∞

9.2     Estimating the barycenter
In this section we show how to efficiently estimate the barycenter of K up to a small accuracy in `2 -norm.
For a convex body K ⊆ Rn , we let γK denote the Gaussian measure restricted to K. For a random variable
X in Rn , we denote the covariance of X by cov[X] = E[(X − E[X])(X − E[X])T ].
    The following lemma shows that the covariance of a Gaussian random vector shrinks when restricted
to a convex body. We include a short proof for completeness.
Lemma 9.1. Given a convex body K in Rn , let γK be the Gaussian distribution restricted to K, and let X
be a random variable distributed according to γK . Then, cov[X]  In .
Proof. Consider f (t) = ln γn (K + t). f is concave in t. This follows from log-concavity of γn , an easy
consequence of the Prékopa-Leindler inequality. Hence, the Hessian of f , H( f ), is negative semi-definite.
It can be calculated that
                               H( f ) = H(ln γn (K + t)) = cov[X + t] − In ,
where X ∼ γK . Setting t = 0 completes the proof.

      We will also need to use Paouris’ inequality [29], which we restate slightly:
Theorem 9.2. If X ⊆ Rn is a log-concave random vector with mean 0 and positive-definite covariance
matrix C, then for every t ≥ 1,     p              √         √
                                 Pr[ X TC−1 X ≥ βt n] ≤ e−t n
where β > 0 is an absolute constant.
Theorem 9.3. Let K be a convex body in Rn , given by a membership oracle, with γn (K) ≥ 1/2. For any
δ > 0 and ε ∈ (0, 1), there is an algorithm which computes the barycenter of K within accuracy δ in
`2 -norm with probability at least 1 − ε in time, polynomial in n, 1/δ and log(1/ε).
Proof. Let b be the barycenter of K and Xi for 1 ≤ i ≤ N be i.i.d generated from γK , where N =
d(β /δ )2 log2 (e/ε)ne. Here β is the constant from Theorem 9.2. Defining the following quantities
                                               1 N
                                      b0 =       ∑ Xi ,
                                               N i=1
                                      Yi = Xi − b,
                                               1 N
                                      Y   =      ∑ Yi ,
                                               N i=1
                                      C =       E [(X − b)(X − b)T ],
                                               X∼γK


                        T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                            49
          DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

we can see that b0 is an estimate of the barycenter, generated by averaging random samples from the
distribution γK and Y is the difference vector between the true barycenter and b0 . Thus it suffices to
bound the probability that Y is large and then show how to efficiently generate random samples from the
distribution γK . It holds that
                                      E[Yi ] = E[Xi − b] = b − b = 0 .
Also, using Lemma 9.1,
                                       E[YiYiT ] = cov[Xi ] = C  In .
Thus,
                                   E[Y ] = 0 and cov[Y ] = C/N  In /N .
Since γK is a log-concave distribution, Xi and hence Yi are log-concave random vectors. It is easily
checked (using the Prékopa-Leindler inequality) that the average of log-concave random variables is also
log-concave and hence Y is a log-concave random vector. Now,
                                   s                          
                                            −1
                                             In            √
             Pr[kY k2 ≥ δ ] = Pr  Y T              Y ≥ δ N
                                             N
                                   s                          
                                            −1
                                             C             √
                            ≤ Pr  Y T              Y ≥ δ N          (using C/N  In /N).
                                             N

Putting N = d(β /δ )2 log2 (e/ε)ne and using Theorem 9.2 with t = log(e/ε), we get
                                                        √
                                Pr[kY k2 ≥ δ ] ≤ e− n log(e/ε) ≤ ε/e ≤ ε/2.                             (9.3)

      We can generate the random points Xi using rejection sampling. For each i, we generate a sequence
                                               (1)          (k)
of i.i.d. standard Gaussian random variables Xi , . . . , Xi ∈ Rn , k = dlog2 (2N/ε)e. We set Xi to the first
  ( j)                                               ( j)
Xi in the sequence that belongs to K; if no such Xi exists, we set Xi arbitrarily. Clearly, conditional on
                                      ( j)
the existence of a j ≤ k such that Xi ∈ K, Xi ∼ γK . Furthermore, because K has Gaussian measure at
                                    ( j)
least 1/2, for every j we have Pr[Xi 6∈ K] ≤ 1/2, so
                                               ( j)
                                     Pr[∀ j : Xi      6∈ K] ≤ 2−k ≤ ε/2N.

By a union bound, with probability at least 1 − ε/2, all Xi are distributed according to γK ; let us call this
event E. Conditional on E, inequality (9.3) holds, and

              Pr[kb − b0 k2 ≥ δ ] = Pr[kY k2 ≥ δ ]
                                 = Pr[kY k2 ≥ δ | E] · Pr[E] + Pr[kY k2 ≥ δ | E c ](1 − Pr[E])
                                 ≤ ε/2 + ε/2 = ε.

The algorithm needs to generate O(N log(N/ε)) d-dimensional Gaussian random variables, check mem-
bership in K for each of them, and compute the average of N points. Since each of these operations
takes polynomial time, and N is polynomial in n, δ , and log(1/ε), the running time of the algorithm is
polynomial.

                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                               50
         T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

9.3    Geometric estimates from Section 6
In this section, we present the required estimates for the proof of Lemma 3.2.
    The following theorem of Latała and Oleszkiewicz will allow us to translate bounds on Gaussian
measure to bounds on Gaussian norm expectations.

Theorem 9.4 ([20]). Let X ∈ Rn be a standard n-dimensional Gaussian. Let K ⊆ Rn be a symmetric
convex body, and let α ≥ 0 be chosen such that Pr[X ∈ K] = Pr[|X1 | ≤ α]. Then the following hold.

   1. For t ∈ [0, 1], Pr[X ∈ tK] ≤ Pr[|X1 | ≤ tα].

   2. For t ≥ 1, Pr[X ∈ tK] ≥ Pr[[X1 | ≤ tα].

   Using the theorem above we derive bound on Gaussian norm expectations. We note that much weaker
and more elementary estimates than those given in Theorem 9.4 would suffice (e. g., Borell’s inequality),
however we use the stronger theorem to achieve a better constant.

Lemma 9.5. Let X ∈ Rn be a standard n-dimensional Gaussian. Let K ⊆ Rn be a symmetric convex
body such that Pr[X ∈ K] ≥ 1/2. Then E[kXkK ] ≤ 1.5.

Proof. Let m > 0 satisfy Pr[X ∈ mK] = Pr[kXkK ≤ m] = 1/2. Note that m ≤ 1 by our assumption on K.
Let α > 0 denote the number such that Pr[|X1 | ≤ α] = γ1 ([−α, α]) = 1/2. Here a numerical calculation
reveals α ≥ .67.
    By Theorem 9.4, we have that Pr[kXkK ≥ tm] ≤ Pr[|X1 | ≥ tα] for t ≥ 1. Thus,
                   Z ∞                       Z ∞                          Z ∞    h        αt i
      E[kXkK ] =     Pr[kXkK ≥ t]dt ≤ m +        Pr[kXkK ≥ t]dt ≤ m +          Pr |X1 | ≥      dt
                                                                                          m
                 0                         m                              m
                       1                                1
                         Z ∞                               Z ∞
               = 1+          Pr[|X1 | ≥ t]dt m ≤ 1 +           Pr[|X1 | ≥ t]dt
                       α α                              α α
                                                              r                   !           r        2
                     1 ∞ 2                                1      2                                2 e−α /2
                       Z
                                            2
                                          −t /2                         2
                                                                    −α /2
               = 1+        √ (t − α)e           dt = 1 +           e       − α/2 = 1/2 +
                     α α     2π                          α       π                                π α
                       r           2
                          2 e−(.67) /2
               ≤ 1/2 +                 ≤ 1.5 .
                          π .67
   The following lemma shows that we can find a large ball in K centered around the origin, if either its
Gaussian mass is large or its barycenter is close to the origin.

Lemma 9.6. Let K ⊆ Rn be a convex body. Then the following hold.

   1. If for r ≥ 0, γ1 ((−∞, r]) ≤ γn (K), then rBn2 ⊆ K.              √
      In particular, if γn (K) = 1/2 + ε, for ε ≥ 0, this holds for r = 2πε.
                                                                                2
   2. If for 0 ≤ r ≤ 1/2, γ1 ([−2r, 2r]) ≤ γn (K) and kb(K)k2 γn (K) ≤ √12π · r2 , then rBn2 ⊆ K. In particular,
        this holds for r = 1/4 if γn (K) ≥ 1/2 and kb(K)k2 ≤ 32√1 2π .


                         T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                               51
          DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

Proof. We begin with part (1). Assume for the sake of contradiction that there exist x ∈ Rn , kxk2 ≤ r, such
that x ∈/ K. Then, by the separator theorem there exists a unit vector θ ∈ Sn−1 , such that maxz∈K hθ , zi <
hθ , xi. In particular, K is strictly contained in the halfspace H = {z ∈ Rn : hθ , zi ≤ g} where g = hθ , xi.
Thus γn (K) < γn (H) = γ1 ((−∞, g]). But note that by Cauchy-Schwarz g ≤ kθ k2 kxk2 = r, a clear
contradiction to the assumption on r.
      For the furthermore, we first see that
                                           Z √2πε                √
                                       1             −x2 /2     ( 2πε)
                                     √             e        dx ≤ √      =ε.
                                       2π 0                        2π
                   √
Hence, γ1 ((−∞, 2πε]) ≤ 1/2 + ε = γn (K), as needed.
      We now prove part (2). Similarly to the above, if K does not contain a ball of radius r, then there
exists a halfspace H = {z ∈ Rn : hθ , zi ≤ r − ε}, for some 0 < ε ≤ r, such that K ⊆ H. By rotational
invariance of the Gaussian measure, we may assume that θ = e1 . Now let Kt = {x ∈ K : x1 = t} and let
 f (t) = γn−1 (Kt − te1 ), where clearly f (t) ∈ [0, 1]. From here, we see that
                                2                                        2
                          e−t /2                                    e−t /2 γn−1 (Kt − te1 )
                     Z r−ε                                     Z r−ε
                                        f (t)
                         t√      R r−ε e−t 2 /2         dt =       t√                       dt
                      −∞    2π          √       f (t)dt         −∞    2π       γn (K)                    (9.4)
                                     −∞       2π
                                                            = b(K)1 ≥ −kb(K)k2 .
Thus to get a contradiction, it suffices to show that
                                                   2
                                         e−t /2
                                    Z r−ε
                                        t√      f (t)dt < −γn (K)kb(K)k2 ,                               (9.5)
                                     −∞    2π
for any function f : (−∞, r − ε] → [0, 1] satisfying
                                            Z r−ε −t 2 /2
                                                 e
                                                 √    f (t)dt = γn (K).                                  (9.6)
                                             −∞    2π
From here, it is easy to see that the function f maximizing the left hand side of (9.5) satisfying (9.6) must
be the indicator function of an interval with right end point r − ε, i. e., the function f which pushes mass
“as far to the right” as possible. Now let l ≤ r −ε denote the unique number such that γ1 ([l, r −ε]) = γn (K),
noting that the optimizing f is now the indicator function of [l, r − ε]. From here, a direct computation
reveals that
                    Z r−ε −t 2 /2
                           e            1        2            2         1       2       2
                          t √ dt = √ (e−l /2 − e−(r−ε) /2 ) < √ (e−l /2 − e−r /2 ) .
                     l        2π         2π                             2π
Now since γ1 ([l, r]) > γn (K) ≥ γ1 ([−2r, 2r]), we must have that l ≤ −2r. Using the inequalities 1 + x ≤
ex ≤ 1 + x + x2 , for |x| ≤ 1/2, we have that
            1    2        2        1       2         2
           √ (e−l /2 − e−r /2 ) ≤ √ (e−(2r) /2 − e−r /2 )
            2π                     2π
                                   1
                                ≤ √ (1 − (2r)2 /2 + (2r)4 /4 − (1 − r2 /2))
                                   2π
                                   1                          1
                                = √ ((4r2 )r2 − 3r2 /2) ≤ − √ · r2 /2 (since r ≤ 1/2) .
                                   2π                         2π

                       T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                52
        T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

But by assumption
                                         1
                                      − √ · r2 /2 ≤ −γn (K)kb(K)k2 ,
                                         2π
yielding the desired contradiction.
    For the furthermore, it follows by a direct numerical computation.

    We will now extend the bound to asymmetric convex bodies having their barycenter near the origin.
To do this, we will need the standard fact that the gauge function of a body is Lipschitz when it contains a
large ball. We recall that a function f : Rn → R is L-Lipschitz if for x, y ∈ Rn , | f (x) − f (y)| ≤ Lkx − yk2 .

Lemma 9.7. Let K ⊆ Rn be a convex body satisfying rBn2 ⊆ K for some r > 0. Then, the gauge function
k · kK : Rn → R+ of K is (1/r)-Lipschitz.

Proof. We need to show

                                              1
                              |kxkK − kykK | ≤ kx − yk2           for all x, y ∈ Rn .
                                              r
To see this, note that

                          kxkK = k(x − y) + ykK
                                 ≤ kx − ykK + kykK              (by triangle inequality)
                                 ≤ kx − ykrBn2 + kykK            (since rBn2 ⊆ K )
                                   1
                                 =   kx − yk2 + kykK ,
                                   r

yielding kxkK − kykK ≤ 1r kx − yk2 . The other inequality follows by switching x and y.

    We will also need the following concentration inequality of Maurey and Pisier.

Theorem 9.8 (Maurey-Pisier). Let f : Rn → R be an L-Lipschitz function. Then for X ∈ Rn standard
Gaussian and t ≥ 0, we have the inequalities
                                                     2                                                2
                                               − 2t2                                          − 2t2
             Pr[ f (X) − E[ f (X)] ≥ tL] ≤ e     π       and   Pr[ f (X) − E[ f (X)] ≤ −tL] ≤ e   π       .

    We now prove the main estimate for asymmetric convex bodies.

Lemma 9.9. Let K ⊆ Rn be a 0-centered convex body and X ∈ Rn be the standard n-dimensional
Gaussian. Then the following hold.

   1. E[kXkK∩−K ] ≤ 2 E[kXkK ].
                                                                    √
   2. If γn (K) ≥ 1/2 and kb(K)k2 ≤ 32√1 2π , then E[kXkK ] ≤ (1 + π 8 ln 2).


                         T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                                53
         DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

Proof. We prove part (1). By symmetry of the Gaussian measure

             E[kXkK∩−K ] = E[max {kXkK , k − XkK }] ≤ E[kXkK + k − XkK ] = 2 E[kXkK ] ,

as needed.                      √
    We prove part (2). Let c = π 8 ln 2. First, by Lemma 9.6 part 2 and our assumptions on K, we have
that (1/4)Bn2 ⊆ K. Thus, by Lemma 9.7 k · kK is 4-Lipschitz. Assume for the sake of contradiction that
E[kXkK ] > 1 + c. Then, since

                                1/2 ≤ γn (K) = Pr[X ∈ K] = Pr[kXkK ≤ 1] ,

we must have that Pr[kXkK − E[kXkK ] ≤ −c] > 1/2. But by Theorem 9.8 and the Lipschitz properties of
k · kK ,
                                                                      2
                                                              − 2(c/4)
                               Pr[kXkK − E[kXkK ] ≤ −c] ≤ e       π 2
                                                                          = 1/2 ,
a clear contradiction.



References
 [1] M ILTON A BRAMOWITZ AND I RENE A. S TEGUN , EDS .: Handbook of Mathematical Functions
     With Formulas, Graphs, and Mathematical Tables. Dover Publ., 1992. Reprint of the 1972 edition
     [LINK to original]. 40

 [2] W OJCIECH BANASZCZYK: Balancing vectors and Gaussian measures of n-dimensional con-
     vex bodies. Random Structures Algorithms, 12(4):351–360, 1998. [doi:10.1002/(sici)1098-
     2418(199807)12:4<351::aid-rsa3>3.0.co;2-s] 2, 12, 19

 [3] W OJCIECH BANASZCZYK: On series of signed vectors and their rearrangements. Random Struc-
     tures Algorithms, 40(3):301–316, 2012. [doi:10.1002/rsa.20373] 3, 12

 [4] N IKHIL BANSAL: Constructive algorithms for discrepancy minimization. In Proc. 51st FOCS, pp.
     3–10. IEEE Comp. Soc. Press, 2010. [doi:10.1109/FOCS.2010.7, arXiv:1002.2259] 2, 7

 [5] N IKHIL BANSAL , DANIEL DADUSH , AND S HASHWAT G ARG: An algorithm for Komlós conjecture
     matching Banaszczyk’s bound. SIAM J. Comput., 48(2):534–553, 2019. Preliminary version in
     FOCS’16. [doi:10.1137/17M1126795] 2, 3, 4, 8

 [6] N IKHIL BANSAL , DANIEL DADUSH , S HASHWAT G ARG , AND S HACHAR L OVETT: The Gram-
     Schmidt walk: a cure for the Banaszczyk blues. In Proc. 50th STOC, pp. 587–597. ACM Press,
     2018. [doi:10.1145/3188745.3188850] 12

 [7] F RANCK BARTHE , O LIVIER G UÉDON , S HAHAR M ENDELSON , AND A SSAF NAOR: A
     probabilistic approach to the geometry of the `np -ball. Ann. Probab., 33(2):480–513, 2005.
     [doi:10.1214/009117904000000874, arXiv:math/0503650] 3

                         T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                     54
       T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

 [8] J ÓZSEF B ECK: Roth’s estimate of the discrepancy of integer sequences is nearly sharp. Combina-
     torica, 1(4):319–325, 1981. [doi:10.1007/BF02579452] 2

 [9] J ÓZSEF B ECK AND T IBOR F IALA: “Integer-making” theorems. Discr. Appl. Math., 3(1):1–8, 1981.
     [doi:10.1016/0166-218X(81)90022-6] 2

[10] C HRISTER B ORELL: The Ehrhard inequality. C. R. Math. Acad. Sci. Paris, 337(10):663–666, 2003.
     [doi:10.1016/j.crma.2003.09.031] 15

[11] B ORIS B UKH: An improvement of the Beck-Fiala theorem. Combin. Probab. Comput., 25(3):380–
     398, 2016. [doi:10.1017/S0963548315000140, arXiv:1306.6081] 2

[12] B ERNARD C HAZELLE: The Discrepancy Method: Randomness and Complexity. Cambridge Univ.
     Press, 2000. [doi:10.1017/CBO9780511626371] 2

[13] W ILLIAM C HEN , A NAND S RIVASTAV, G IANCARLO T RAVAGLINI , ET AL .: A Panorama of
     Discrepancy Theory. Springer, 2014. [doi:10.1007/978-3-319-04696-9] 2

[14] DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV:
     Towards a constructive version of Banaszczyk’s vector balancing theorem. In Proc. 20th Internat.
     Workshop on Randomization and Computation (RANDOM’16), volume 60 of LIPIcs, pp. 28:1–
     28:12. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2016. [doi:10.4230/LIPIcs.APPROX-
     RANDOM.2016.28] 2, 12

[15] A NTOINE E HRHARD: Symétrisation dans l’espace de Gauss. Math. Scand., 53(2):281–301, 1983.
     [doi:10.7146/math.scand.a-12035] 15

[16] RONEN E LDAN AND M OHIT S INGH: Efficient algorithms for discrepancy minimization in
     convex sets. Random Structures Algorithms, 53(2):289–307, 2018. [doi:10.1002/rsa.20763,
     arXiv:1409.2913] 2, 4, 7, 10

[17] E STHER E ZRA AND S HACHAR L OVETT: On the Beck-Fiala conjecture for random set systems.
     Random Structures Algorithms, 54(4):665–675, 2019. Preliminary version in RANDOM’16.
     [doi:10.1002/rsa.20810, arXiv:1511.00583] 2

[18] DAVID A. F REEDMAN: On tail probabilities for martingales. Ann. Probab., 3(1):100–118, 1975.
     [doi:10.1214/aop/1176996452] 24

[19] N ICHOLAS J. A. H ARVEY, ROY S CHWARTZ , AND M OHIT S INGH: Discrepancy without partial
     colorings. In Proc. 17th Internat. Workshop on Approximation Algorithms for Combinatorial
     Optimization Problems (APPROX’14), volume 28 of LIPIcs, pp. 258–273. Schloss Dagstuhl–
     Leibniz-Zentrum fuer Informatik, 2014. [doi:10.4230/LIPIcs.APPROX-RANDOM.2014.258] 2, 7,
     8

[20] R AFAŁ L ATAŁA AND K RZYSZTOF O LESZKIEWICZ: Gaussian measures of dilatations of convex
     symmetric sets. Ann. Probab., 27(4):1922–1938, 1999. [doi:10.1214/aop/1022874821] 51

                     T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                        55
         DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

[21] AVI L EVY, H ARISHCHANDRA R AMADAS , AND T HOMAS ROTHVOSS: Deterministic discrep-
     ancy minimization via the multiplicative weight update method. In Proc. 19th Ann. Conf. on
     Integer Programming and Combinatorial Optimization (IPCO ’17), pp. 380–391. Springer, 2017.
     [doi:10.1007/978-3-319-59250-3_31, arXiv:1611.08752] 2
[22] L ÁSZLÓ L OVÁSZ , J OEL H. S PENCER , AND K ATALIN V ESZTERGOMBI: Discrepancy of
     set-systems and matrices. European J. Combin., 7(2):151–160, 1986. [doi:10.1016/S0195-
     6698(86)80041-5] 5
[23] S HACHAR L OVETT AND R AGHU M EKA: Constructive discrepancy minimization by walking
     on the edges. SIAM J. Comput., 44(5):1573–1582, 2015. Preliminary version in FOCS’12.
     [doi:https://doi.org/10.1137/130929400, arXiv:1203.5747] 2, 7, 8
[24] J I ŘÍ M ATOUŠEK: Geometric Discrepancy: An Illustrated Guide. Springer, 1999. [doi:10.1007/978-
     3-642-03942-3] 2, 5
[25] J OHN VON N EUMANN: Zur Theorie der Gesellschaftsspiele. Mathematische Annalen, 100(1):295–
     320, 1928. [doi:10.1007/BF01448847] 20
[26] A LEKSANDAR N IKOLOV:          The Komlós conjecture holds for vector colorings, 2013.
     [arXiv:1301.4039] 7, 22, 26
[27] A LEKSANDAR N IKOLOV: Tighter bounds for the discrepancy of boxes and polytopes. Mathematika,
     63(3):1091–1113, 2017. [doi:10.1112/S0025579317000250, arXiv:1701.05532] 12
[28] A LEKSANDAR N IKOLOV AND K UNAL TALWAR: Approximating hereditary discrepancy via small
     width ellipsoids. In Proc. 26th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’15), pp.
     324–336. ACM Press, 2015. [doi:10.1137/1.9781611973730.24, arXiv:1311.6204] 3
[29] G RIGORIS PAOURIS: Concentration of mass on convex bodies. Geom. Funct. Anal., 16(5):1021–
     1049, 2006. [doi:10.1007/s00039-006-0584-5] 49
[30] T HOMAS ROTHVOSS: Constructive discrepancy minimization for convex sets. SIAM J. Comput.,
     46(1):224–234, 2017. Preliminary version in FOCS’14. [doi:10.1137/141000282, arXiv:1404.0339]
     2, 4, 7
[31] J OEL S PENCER: Six standard deviations suffice. Trans. Amer. Math. Soc., 289(2):679–706, 1985.
     [doi:10.1090/S0002-9947-1985-0784009-0] 7
[32] J OEL S PENCER: Ten Lectures on the Probabilistic Method.           Volume 52.     SIAM, 1987.
     [doi:10.1137/1.9781611970074] 3, 5
[33] A RAVIND S RINIVASAN: Improving the discrepancy bound for sparse matrices: better approxima-
     tions for sparse lattice approximation problems. In Proc. 8th Ann. ACM-SIAM Symp. on Discrete
     Algorithms (SODA’97), pp. 692–701. ACM Press, 1997. ACM DL. 2
[34] M ICHEL TALAGRAND: The Generic Chaining: Upper and Lower Bounds of Stochastic Processes.
     Springer, 2005. [doi:10.1007/3-540-27499-5] 18


                     T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                         56
    T OWARDS A C ONSTRUCTIVE V ERSION OF BANASZCZYK ’ S V ECTOR BALANCING T HEOREM

AUTHORS

   Daniel Dadush
   Researcher
   Centrum Wiskunde & Informatica
   Amsterdam, the Netherlands
   dadush cwi nl
   https://homepages.cwi.nl/~dadush/


   Shashwat Garg
   Quantitative researcher
   WorldQuant
   Budapest, Hungary
   garg shashwat gmail com


   Shachar Lovett
   Associate professor
   Department of Computer Science and Engineering
   University of California, San Diego
   San Diego, CA, USA
   slovett cs ucsd edu
   https://cseweb.ucsd.edu/~slovett/


   Aleksandar Nikolov
   Assistant professor
   Department of Computer Science
   University of Toronto
   Toronto, Ontario, Canada
   anikolov cs toronto edu
   http://www.cs.toronto.edu/~anikolov/




                 T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58               57
       DANIEL DADUSH , S HASHWAT G ARG , S HACHAR L OVETT, AND A LEKSANDAR N IKOLOV

ABOUT THE AUTHORS

    DANIEL DADUSH is a tenured researcher at the Centrum Wiskunde & Informatica (CWI)
      in Amsterdam, Netherlands. He earned his Ph. D. in algorithms, combinatorics and
      optimization (ACO) from Georgia Tech in 2012, where his advisor was Santosh Vempala.
      Before joining CWI, he spent two years as a Simons postdoctoral fellow in the Computer
      Science Department at New York University. His research has focused on algorithms for
      lattice problems, integer programming, and high-dimensional convex geometry.
       He lives and works in Amsterdam, and is glad that, thus far, the city remains above water.
       He enjoys reading the New York Times, listening to NPR, and taking long bike rides
       along the canals when it is not raining.


    S HASHWAT G ARG is a Quantitative Researcher at WorldQuant LLC. He got his Ph. D. in
       algorithms from Eindhoven University of Technology in 2019 where his advisor was
       Nikhil Bansal. His research focused on algorithms for combinatorial discrepancy and
       approximation algorithms for scheduling problems. He lives in Budapest, Hungary. He
       enjoys reading anything under the sun, exploring new cafés and painting.


    S HACHAR L OVETT is an associate professor at the University of California, San Diego. He
       received his Ph. D. from the Weizmann Institute of Science in Israel, where his advisors
       were Omer Reingold and Ran Raz. Before joining UC San Diego, he was a member
       of the Institute for Advanced Study in Princeton. His research is broadly in theoretical
       computer science and related mathematics, with a focus on computational complexity,
       randomness and pseudo-randomness, algebraic constructions, coding theory, additive
       combinatorics and high-dimensional geometry. Other than doing research, Shachar has a
       wife, three kids, one dog and three chickens, so he never has time to be bored.


    A LEKSANDAR N IKOLOV is an assistant professor in the Department of Computer Science
       at the University of Toronto. He received his Ph. D. in 2014 from Rutgers University,
       where his supervisor was S. Muthukrishnan, and did a postdoc with the Theory Group
       at Microsoft Research in Redmond. He is a Canada Research Chair in Algorithms and
       Privacy. He is interested in discrepancy theory, differential privacy, convex geometry and
       geometric algorithms.
       Sasho, as he is known to his friends, was born in Varna, Bulgaria. He enjoys stand-up
       comedy, watching obscure artsy movies, and reading a book or The New Yorker with a
       cup of coffee.




                   T HEORY OF C OMPUTING, Volume 15 (15), 2019, pp. 1–58                            58