Theory of Computing, Volume 10 (11), 2014, pp. 257–295
www.theoryofcomputing.org




          Efficient Rounding for the
    Noncommutative Grothendieck Inequality
                     Assaf Naor∗                  Oded Regev†               Thomas Vidick‡
                Received February 16, 2013; Revised August 11, 2014; Published October 2, 2014




       Abstract: The classical Grothendieck inequality has applications to the design of ap-
       proximation algorithms for NP-hard optimization problems. We show that an algorithmic
       interpretation may also be given for a noncommutative generalization of the Grothendieck
       inequality due to Pisier and Haagerup. Our main result, an efficient rounding procedure for
       this inequality, leads to a polynomial-time constant-factor approximation algorithm for an
       optimization problem which generalizes the Cut Norm problem of Frieze and Kannan, and is
       shown here to have additional applications to robust principal component analysis and the
       orthogonal Procrustes problem.

ACM Classification: G.1.6
AMS Classification: 68W25
Key words and phrases: approximation algorithms, Grothendieck inequality, semidefinite programming,
principal component analysis




      An extended abstract of this paper appeared in the Proceedings of the 45th Annual ACM Symposium on Theory of
Computing, 2013 [34].
    ∗ Supported by NSF grant CCF-0832795, BSF grant 2010021, the Packard Foundation and the Simons Foundation. Part of this work was completed while A. N. was visiting Université de Paris Est Marne-la-Vallée.
    † Supported by a European Research Council (ERC) Starting Grant. Part of the work was done while the author was with the CNRS, DI, ENS, Paris.
    ‡ Partially supported by the National Science Foundation under Grant No. 0844626 and by the Ministry of Education, Singapore under the Tier 3 grant MOE2012-T3-1-009.


© 2014 Assaf Naor, Oded Regev, and Thomas Vidick
Licensed under a Creative Commons Attribution License (CC-BY)                          DOI: 10.4086/toc.2014.v010a011

Contents
1     Introduction
      1.1   Applications of Theorem 1.1
            1.1.1   The Grothendieck problem
            1.1.2   Robust PCA
            1.1.3   The orthogonal Procrustes problem
            1.1.4   A Frieze-Kannan decomposition for 4-tensors
            1.1.5   Quantum XOR games
      1.2   The noncommutative Grothendieck inequality
            1.2.1   The complex case
            1.2.2   The rounding algorithm
            1.2.3   An intuitive description of the rounding procedure in the commutative case

2     Proof of Theorem 1.4
      2.1   Analysis of the rounding procedure
      2.2   Derandomized rounding
      2.3   The rounding procedure in the case of (1.18)

3     The real and Hermitian cases
      3.1   Two-dimensional rounding
      3.2   Rounding in the Hermitian case
      3.3   Proof of Theorem 3.1

4     Direct rounding in the real case

5     Some applications
      5.1   Constant-factor algorithm for robust PCA problems
      5.2   A constant-factor algorithm for the orthogonal Procrustes problem
      5.3   An algorithmic noncommutative dense regularity lemma

References





1    Introduction
In what follows, the standard scalar product on $\mathbb{C}^n$ is denoted $\langle\cdot,\cdot\rangle$, i.e., $\langle x,y\rangle = \sum_{i=1}^n x_i \overline{y_i}$ for all $x,y \in \mathbb{C}^n$. We always think of $\mathbb{R}^n$ as canonically embedded in $\mathbb{C}^n$; in particular the restriction of $\langle\cdot,\cdot\rangle$ to $\mathbb{R}^n$ is the standard scalar product on $\mathbb{R}^n$. Given a set $S$, the space $M_n(S)$ stands for all the matrices $M = (M_{ij})_{i,j=1}^n$ with $M_{ij} \in S$ for all $i,j \in \{1,\ldots,n\}$. Thus, $M_n(M_n(\mathbb{R}))$ is naturally identified with the $n^4$-dimensional space of all 4-tensors $M = (M_{ijkl})_{i,j,k,l=1}^n$ with $M_{ijkl} \in \mathbb{R}$ for all $i,j,k,l \in \{1,\ldots,n\}$. The set of all $n \times n$ orthogonal matrices is denoted $O_n \subseteq M_n(\mathbb{R})$, and the set of all $n \times n$ unitary matrices is denoted $U_n \subseteq M_n(\mathbb{C})$.
    Given $M = (M_{ijkl}) \in M_n(M_n(\mathbb{R}))$ denote
\[
  \mathrm{Opt}_{\mathbb{R}}(M) \stackrel{\mathrm{def}}{=} \sup_{U,V \in O_n}\ \sum_{i,j,k,l=1}^n M_{ijkl} U_{ij} V_{kl}
\]
and, similarly, for $M = (M_{ijkl}) \in M_n(M_n(\mathbb{C}))$ denote
\[
  \mathrm{Opt}_{\mathbb{C}}(M) \stackrel{\mathrm{def}}{=} \sup_{U,V \in U_n}\ \Big| \sum_{i,j,k,l=1}^n M_{ijkl} U_{ij} \overline{V_{kl}} \Big| .
\]

Theorem 1.1. There exists a polynomial-time algorithm that takes as input $M \in M_n(M_n(\mathbb{R}))$ and outputs $U, V \in O_n$ such that
\[
  \mathrm{Opt}_{\mathbb{R}}(M) \leqslant O(1) \sum_{i,j,k,l=1}^n M_{ijkl} U_{ij} V_{kl} .
\]
Analogously, there exists a polynomial-time algorithm that takes as input $M \in M_n(M_n(\mathbb{C}))$ and outputs $U, V \in U_n$ such that
\[
  \mathrm{Opt}_{\mathbb{C}}(M) \leqslant O(1) \Big| \sum_{i,j,k,l=1}^n M_{ijkl} U_{ij} \overline{V_{kl}} \Big| .
\]

    We will explain the ideas that go into the proof of Theorem 1.1 later, and it suffices to say at
this juncture that our algorithm is based on a rounding procedure for semidefinite programs that is
markedly different from rounding algorithms that have been previously used in the optimization literature,
and as such it indicates the availability of techniques that have thus far remained untapped for the
purpose of algorithm design. A non-constructive version of Theorem 1.1 was given in earlier works,
as explained below; our contribution here is to design an efficient rounding algorithm and to establish
various applications of it. Before explaining the proof of Theorem 1.1 we list some of its applications as
an indication of its usefulness.
Remark 1.2. The implied constants in the $O(1)$ terms of Theorem 1.1 can be taken to be any number greater than $2\sqrt{2}$ in the real case, and any number greater than $2$ in the complex case. There is no reason to believe that the factor $2\sqrt{2}$ in the real case is optimal, but the factor $2$ in the complex case is sharp in a certain natural sense that will become clear later. The main content of Theorem 1.1 is the availability of a constant-factor algorithm rather than the value of the constant itself. In particular, the novelty of the applications to combinatorial optimization that are described below is the mere existence of a constant-factor approximation algorithm.


1.1     Applications of Theorem 1.1
We now describe some examples demonstrating the usefulness of Theorem 1.1. The first example does
not lead to a new result, and is meant to put Theorem 1.1 in context. All the other examples lead to
new algorithmic results. Many of the applications below follow from a more versatile reformulation of
Theorem 1.1 that is presented in Section 5 (see Proposition 5.1).

1.1.1    The Grothendieck problem
The Grothendieck optimization problem takes as input a matrix $A \in M_n(\mathbb{R})$ and aims to efficiently compute (or estimate) the quantity
\[
  \max_{\varepsilon,\delta \in \{-1,1\}^n} \sum_{i,j=1}^n A_{ij} \varepsilon_i \delta_j .
  \tag{1.1}
\]
This problem falls into the framework of Theorem 1.1 by considering the 4-tensor $M \in M_n(M_n(\mathbb{R}))$ given by $M_{iijj} = A_{ij}$ and $M_{ijkl} = 0$ if either $i \neq j$ or $k \neq l$. Indeed,
\[
  \mathrm{Opt}_{\mathbb{R}}(M) = \max_{U,V \in O_n} \sum_{i,j=1}^n A_{ij} U_{ii} V_{jj} = \max_{\varepsilon,\delta \in \{-1,1\}^n} \sum_{i,j=1}^n A_{ij} \varepsilon_i \delta_j .
\]


Here we used the diagonal structure of the tensor $M$ to argue that one may without loss of generality restrict the variables $U, V \in O_n$ to be diagonal $\pm 1$ matrices. While diagonal matrices always commute, in general $U, V \in O_n$ will not, and it is in this sense that we may think of $\mathrm{Opt}_{\mathbb{R}}(\cdot)$ as a noncommutative generalization of (1.1).
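    As an illustration (our numerical sketch, not part of the paper), the embedding can be checked directly: with diagonal $\pm 1$ matrices the 4-tensor objective collapses to the bilinear form of (1.1).

    import numpy as np

    def embed_grothendieck(A):
        """Embed A in a 4-tensor M with M[i,i,j,j] = A[i,j], zero elsewhere."""
        n = A.shape[0]
        M = np.zeros((n, n, n, n))
        for i in range(n):
            for j in range(n):
                M[i, i, j, j] = A[i, j]
        return M

    def objective(M, U, V):
        # sum_{i,j,k,l} M_ijkl * U_ij * V_kl
        return np.einsum('ijkl,ij,kl->', M, U, V)

    rng = np.random.default_rng(0)
    n = 4
    A = rng.standard_normal((n, n))
    M = embed_grothendieck(A)
    eps = rng.choice([-1.0, 1.0], size=n)
    delta = rng.choice([-1.0, 1.0], size=n)
    # with diagonal +-1 matrices the 4-tensor objective equals (1.1)
    assert np.isclose(objective(M, np.diag(eps), np.diag(delta)), eps @ A @ delta)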
    A polynomial-time constant-factor approximation algorithm for the Grothendieck problem was
designed in [2], where it was also shown that it is NP-hard to approximate this problem within a factor
less than $1 + \varepsilon_0$ for some $\varepsilon_0 \in (0,1)$. A simple transformation [2] relates the Grothendieck problem
to the Frieze-Kannan Cut Norm problem [12] (this transformation can be made to have no loss in the
approximation guarantee [26, Sec. 2.1]), and as such the constant-factor approximation algorithm for
the Grothendieck problem has found a variety of applications in combinatorial optimization; see the
survey [26] for much more on this topic. In another direction, based on important work of Tsirelson [45],
the Grothendieck problem has found applications to quantum information theory [7]. Since the problem
of computing OptR (·) contains the Grothendieck problem as a special case, Theorem 1.1 encompasses all
of these applications, albeit with the approximation factor being a larger constant.

1.1.2    Robust PCA
The input to the classical principal component analysis (PCA) problem is an integer $K \in \mathbb{N}$ and a set of points $a_1, \ldots, a_N \in \mathbb{R}^n$. The goal is to find a $K$-dimensional subspace maximizing the sum of the squared $\ell_2$ norms of the projections of the $a_i$ on the subspace. Equivalently, the problem is to find the maximizing vectors in
\[
  \max_{\substack{y_1,\ldots,y_K \in \mathbb{R}^n \\ \langle y_i, y_j\rangle = \delta_{ij}}} \sum_{i=1}^N \sum_{j=1}^K |\langle a_i, y_j\rangle|^2 ,
  \tag{1.2}
\]
where here, and in what follows, $\delta_{ij}$ is the Kronecker delta. This question has a closed-form solution in terms of the singular values of the $N \times n$ matrix whose $i$-th row contains the coefficients of the point $a_i$.
    The fact that the quantity appearing in (1.2) is the maximum of the sum of the squared norms of the
projected points makes it somewhat non-robust to outliers, in the sense that a single long vector can have
a large effect on the maximum. Several more robust versions of PCA were suggested in the literature.
One variant, known as “R1-PCA,” is due to Ding, Zhou, He, and Zha [10], and aims to maximize the sum
of the Euclidean norms of the projected points, namely,
                                                       N  K                   1/2
                                          max n ∑
                                       y1 ,...,yK ∈R i=1
                                                            ∑ |hai , y j i|2          .                       (1.3)
                                                            j=1
                                        hyi ,y j i=δi j


We are not aware of any prior efficient algorithm for this problem that achieves a guaranteed approximation
factor. Another robust variant of PCA, known as “L1-PCA,” was suggested by Kwak [29], and further studied by McCoy and Tropp [32] (see Section 2.7 in [32] in particular). Here the goal is to maximize the sum of the $\ell_1$ norms of the projected points, namely,
\[
  \max_{\substack{y_1,\ldots,y_K \in \mathbb{R}^n \\ \langle y_i, y_j\rangle = \delta_{ij}}} \sum_{i=1}^N \sum_{j=1}^K |\langle a_i, y_j\rangle| .
  \tag{1.4}
\]


In [32] a constant-factor approximation algorithm for the above problem is obtained for K = 1 based
on [2], and for general K an approximation algorithm with an approximation guarantee of O(log n) is
obtained based on prior work by So [43].
    In Section 5.1 we show that both of the above robust versions of PCA can be cast as special cases
of Theorem 1.1, thus yielding constant-factor approximation algorithms for both problems and all
K ∈ {1, . . . , n}. The key observation is that both (1.3) and (1.4) can be expressed as bilinear optimization
problems over a pair of orthogonal matrices. For instance, in the case of (1.4) one of the matrix variables
will have the $y_i$ as its columns, while the other will be diagonal, with diagonal entries (in an optimal solution) matching the sign of $\langle a_i, y_j\rangle$.
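For intuition (our phrasing of the observation above, not a display taken from Section 5.1), the inner absolute values in (1.4) can be linearized by signs that are optimized jointly with the $y_j$:
\[
  \sum_{i=1}^N \sum_{j=1}^K |\langle a_i, y_j\rangle|
  = \max_{\varepsilon \in \{-1,1\}^{N \times K}} \sum_{i=1}^N \sum_{j=1}^K \varepsilon_{ij} \langle a_i, y_j\rangle ,
\]
so that (1.4) becomes a bilinear optimization problem in the orthonormal system $(y_j)$ and the signs $\varepsilon_{ij}$.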

1.1.3   The orthogonal Procrustes problem
Let $n, d \geqslant 1$ and $K \geqslant 2$ be integers. Suppose that $S_1, \ldots, S_K \subseteq \mathbb{R}^d$ are $n$-point subsets of $\mathbb{R}^d$. The goal of the generalized orthogonal Procrustes problem is to rotate each of the $S_k$ separately so as to best align them. Formally, write $S_k = \{x_1^k, x_2^k, \ldots, x_n^k\}$. The goal is to find $K$ orthogonal matrices $U_1, \ldots, U_K \in O_d$ that maximize the quantity
\[
  \sum_{i=1}^n \Big\| \sum_{k=1}^K U_k x_i^k \Big\|_2^2 .
  \tag{1.5}
\]

    If one focuses on a single summand appearing in (1.5), say $\|\sum_{k=1}^K U_k x_1^k\|_2^2$, then it is clear that in order to maximize its length one would want to rotate each of the $x_1^k$ so that they would all point in the same direction, i.e., they would all be positive multiples of the same vector. The above problem aims to achieve the best possible such alignment (in aggregate) for multiple summands of this type. We note that by
expanding the squares one sees that $U_1, \ldots, U_K \in O_d$ maximize the quantity appearing in (1.5) if and only if they minimize the quantity
\[
  \sum_{i=1}^n \sum_{k,l=1}^K \big\| U_k x_i^k - U_l x_i^l \big\|_2^2 .
\]
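To see the equivalence (a short verification we include for convenience), expand the squares and use that orthogonal matrices preserve Euclidean norms:
\[
  \sum_{k,l=1}^K \big\| U_k x_i^k - U_l x_i^l \big\|_2^2
  = 2K \sum_{k=1}^K \big\| x_i^k \big\|_2^2 - 2 \Big\| \sum_{k=1}^K U_k x_i^k \Big\|_2^2 ,
\]
and the first term on the right does not depend on $U_1, \ldots, U_K$.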
    The term “generalized” was used above because the orthogonal Procrustes problem refers to the case $K = 2$, which has a closed-form solution. The name “Procrustes” is a (macabre) reference to Greek mythology (see http://en.wikipedia.org/wiki/Procrustes).
    The generalized orthogonal Procrustes problem has been extensively studied since the 1970s, initially
in the psychometric literature (see, e. g., [6, 14, 3]), and more recent applications of it are to areas such
as image and shape analysis, market research, and biometric identification; see the books [15, 11], the
lecture notes [44], and [33] for much more information on this topic.
    The generalized orthogonal Procrustes problem is known to be intractable, and it has been investigated
algorithmically in, e.g., [3, 4, 42]. A rigorous analysis of a polynomial-time approximation algorithm for this problem appears in the work of Nemirovski [35], where the generalized orthogonal Procrustes problem is treated as an important special case of a more general family of problems called “quadratic optimization under orthogonality constraints,” for which he obtains an $O\big(\sqrt[3]{n + d + \log K}\big)$-approximation algorithm. This was subsequently improved by So [43] to $O(\log(n + d + K))$. In Section 5.2 we use
Theorem 1.1 to improve the approximation guarantee for the generalized orthogonal Procrustes problem
as defined above to a constant approximation factor. See also Section 5.2 for a more complete discussion
of variants of this problem considered in [35, 43] and how they compare to our work.

1.1.4   A Frieze-Kannan decomposition for 4-tensors
In [12] Frieze and Kannan designed an algorithm which decomposes every (appropriately defined)
“dense” matrix into a sum of a few “cut matrices” plus an error matrix that has small cut-norm. We
refer to [12] and also Section 2.1.2 in the survey [26] for a precise formulation of this statement, as well
as its extension, due to [2], to an algorithm that allows sub-constant errors. In Section 5.3 we apply
Theorem 1.1 to prove the following result, which can be viewed as a noncommutative variant of the
Frieze-Kannan decomposition. For the purpose of the statement below it is convenient to identify the
space $M_n(M_n(\mathbb{C}))$ of all 4-tensors with $M_n(\mathbb{C}) \otimes M_n(\mathbb{C})$. Also, for $M \in M_n(\mathbb{C}) \otimes M_n(\mathbb{C})$ we denote from now on its Frobenius (Hilbert-Schmidt) norm by
\[
  \|M\|_2 \stackrel{\mathrm{def}}{=} \sqrt{ \sum_{i,j,k,l=1}^n |M_{ijkl}|^2 } .
\]

Theorem 1.3. There exists a universal constant $c \in (0,\infty)$ with the following property. Suppose that $M \in M_n(\mathbb{C}) \otimes M_n(\mathbb{C})$ and $0 < \varepsilon \leqslant 1/2$, and let
\[
  T \stackrel{\mathrm{def}}{=} \bigg\lceil \frac{c\, n^2 \|M\|_2^2}{\varepsilon^2\, \mathrm{Opt}_{\mathbb{C}}(M)^2} \bigg\rceil .
  \tag{1.6}
\]
One can compute in time $\mathrm{poly}(n, 1/\varepsilon)$ a decomposition
\[
  M = \sum_{t=1}^T \alpha_t (A_t \otimes B_t) + E
  \tag{1.7}
\]
such that $A_t, B_t \in U_n$, the coefficients $\alpha_t \in \mathbb{C}$ satisfy $|\alpha_t| \leqslant \|M\|_2/n$, and $\mathrm{Opt}_{\mathbb{C}}(E) \leqslant \varepsilon\, \mathrm{Opt}_{\mathbb{C}}(M)$. Moreover, if $M \in M_n(\mathbb{R}) \otimes M_n(\mathbb{R})$ then one can replace $\mathrm{Opt}_{\mathbb{C}}(M)$ in (1.6) by $\mathrm{Opt}_{\mathbb{R}}(M)$, take the coefficients $\alpha_t$ to be real, $A_t, B_t \in O_n$, and $E$ such that $\mathrm{Opt}_{\mathbb{R}}(E) \leqslant \varepsilon\, \mathrm{Opt}_{\mathbb{R}}(M)$.
    Theorem 1.3 contains as a special case its commutative counterpart, as studied in [12, 2]. Here we are given $A \in M_n(\mathbb{R})$ with $|a_{ij}| \leqslant 1$ for all $i,j \in \{1,\ldots,n\}$, and we aim for an error $\varepsilon n^2$. Define $M_{iijj} = a_{ij}$ and $M_{ijkl} = 0$ if $i \neq j$ or $k \neq l$. Then $\|M\|_2 \leqslant n$. An application of Theorem 1.3 (in the real case) with $\varepsilon$ replaced by $\varepsilon n^2/\mathrm{Opt}_{\mathbb{R}}(M)$ yields a decomposition $A = \sum_{t=1}^T \alpha_t (a_t b_t^*) + E$ with $a_t, b_t \in [-1,1]^n$ and $E \in M_n(\mathbb{R})$ satisfying
\[
  \sup_{\varepsilon,\delta \in \{-1,1\}^n} \sum_{i,j=1}^n E_{ij} \varepsilon_i \delta_j \leqslant \varepsilon n^2 .
\]
Moreover, the number of terms is $T = O(1/\varepsilon^2)$.
    Theorem 1.3 is proved in Section 5.3 via an iterative application of Theorem 1.1, following the
“energy decrement” strategy as formulated by Lovász and Szegedy [31] in the context of general weak
regularity lemmas. Other than being a structural statement of interest in its own right, we show in
Section 5.3 that Theorem 1.3 can be used to enhance the constant-factor approximation of Theorem 1.1
to a PTAS for computing OptC (M) when OptC (M) = Ω(nkMk2 ). Specifically, if OptC (M) ⩾ κnkMk2
then one can compute a (1 + ε)-factor approximation to OptC (M) in time 2poly(1/(κε)) poly(n). This is
reminiscent of the Frieze-Kannan algorithmic framework [12] for dense graph and matrix problems.
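    To convey the mechanism, here is our schematic transcription of the energy-decrement loop in the real case. It is a sketch under stated assumptions: approx_opt is a hypothetical black box for the algorithm of Theorem 1.1, and the stopping rule is stated up to the approximation constant C.

    import numpy as np

    def decompose(M, approx_opt, eps, C=4.0):
        """Greedy decomposition in the spirit of Theorem 1.3 (real case).

        approx_opt(E) must return (U, V, val) with U, V orthogonal and
        val = sum_ijkl E_ijkl U_ij V_kl >= Opt_R(E) / C.
        """
        n = M.shape[0]
        E = M.copy()
        terms = []
        _, _, lb = approx_opt(M)              # lb <= Opt_R(M) <= C * lb
        while True:
            U, V, val = approx_opt(E)
            if C * val <= eps * lb:           # then Opt_R(E) <= eps * Opt_R(M)
                return terms, E
            alpha = val / n ** 2              # ||U (x) V||_2^2 = n^2
            terms.append((alpha, U, V))
            # orthogonal-projection step: ||E||_2^2 drops by val^2 / n^2,
            # so the loop ends after O(n^2 ||M||_2^2 / (eps * Opt)^2)
            # iterations up to constants, cf. (1.6)
            E = E - alpha * np.einsum('ij,kl->ijkl', U, V)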

1.1.5   Quantum XOR games
As we already noted, the Grothendieck problem (recall Section 1.1.1) also has consequences in quantum
information theory [7], and more specifically to bounding the power of entanglement in so-called “XOR
games,” which are two-player one-round games in which the players each answer with a bit and the
referee bases her decision on the XOR of the two bits. As will be explained in detail in Section 1.2 below,
the literature on the Grothendieck problem relies on a classical inequality of Grothendieck [16], while our
work relies on a more recent yet by now classical noncommutative Grothendieck inequality of Pisier [36]
(and its sharp form due to Haagerup [18]). Even more recently, the Grothendieck inequality has been
generalized to another setting, that of completely bounded linear maps defined on operator spaces [38, 21].
While we do not discuss the operator-space Grothendieck inequality here, we remark that in [40] the
operator-space Grothendieck inequality is proved by reducing it to the Pisier-Haagerup noncommutative
Grothendieck inequality. Without going into details, we note that this reduction is also algorithmic.
Combined with our results, it leads to an algorithmic proof of the operator-space Grothendieck inequality,
together with an accompanying rounding procedure.
     In [41], the last two named authors show that the noncommutative and operator-space Grothendieck
inequalities have consequences in a setting that generalizes that of classical XOR games, called “quantum
XOR games”: in such games, the questions to the players may be quantum states (and the answers are
still a single classical bit). In [41], efficient constant-factor approximation algorithms are derived for the
maximum success probability of players in such a game, in three settings: players sharing an arbitrary
quantum state, players sharing a maximally entangled state, and players not sharing any entanglement.
Using Theorem 1.1, one can efficiently compute strategies achieving the same approximation guarantee.
These matters are taken up in [41] and will not be discussed further here.


1.2   The noncommutative Grothendieck inequality
The natural semidefinite relaxation of (1.1) is
\[
  \sup_{d \in \mathbb{N}}\ \sup_{x,y \in (S^{d-1})^n} \sum_{i,j=1}^n A_{ij} \langle x_i, y_j\rangle ,
  \tag{1.8}
\]
where $S^{d-1}$ is the unit sphere of $\mathbb{R}^d$. (The dimension $d$ can be taken to be at most the number of vector variables, $d \leqslant 2n$.) Since, being a semidefinite program (SDP), the quantity appearing in (1.8) can be computed in polynomial time with arbitrarily good precision (see [17]), the fact that the Grothendieck optimization problem admits a polynomial-time constant-factor approximation algorithm follows from the following inequality, which is a classical inequality of Grothendieck of major importance to several mathematical disciplines (see Pisier’s survey [37] and the references therein for much more on this topic; the formulation of the inequality as below is due to Lindenstrauss and Pełczyński [30]):
\[
  \sup_{d \in \mathbb{N}}\ \sup_{x,y \in (S^{d-1})^n} \sum_{i,j=1}^n A_{ij} \langle x_i, y_j\rangle
  \leqslant K_G \sup_{\varepsilon,\delta \in \{-1,1\}^n} \sum_{i,j=1}^n A_{ij} \varepsilon_i \delta_j .
  \tag{1.9}
\]
Here $K_G \in (0,\infty)$, which is understood to be the infimum over those constants for which (1.9) holds true for all $n \in \mathbb{N}$ and all $A \in M_n(\mathbb{R})$, is a universal constant known as the (real) Grothendieck constant. Its exact value remains unknown, the best available bounds [39, 5] being $1.676 < K_G < 1.783$. In order to actually find an assignment $\varepsilon, \delta$ to (1.1) that is within a constant factor of the optimum one needs to argue that a proof of (1.9) can be turned into an efficient rounding algorithm; this is done in [2].
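    For concreteness, the relaxation (1.8) can be written down directly as an SDP over the Gram matrix of the vectors $x_i, y_j$. The following sketch is our illustration (it assumes the cvxpy package and an installed SDP solver such as SCS; none of this is from the paper):

    import cvxpy as cp
    import numpy as np

    def grothendieck_sdp(A):
        """Value of (1.8): maximize sum_ij A_ij <x_i, y_j> over unit
        vectors, via the Gram matrix Z of (x_1..x_n, y_1..y_n)."""
        n = A.shape[0]
        Z = cp.Variable((2 * n, 2 * n), PSD=True)
        objective = cp.Maximize(cp.sum(cp.multiply(A, Z[:n, n:])))
        constraints = [cp.diag(Z) == 1]       # all vectors have unit norm
        cp.Problem(objective, constraints).solve()
        return Z.value

    A = np.random.default_rng(1).standard_normal((5, 5))
    Z = grothendieck_sdp(A)
    # the vectors x_i, y_j can be recovered from a factorization of Z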
    If one wishes to mimic the above algorithmic success of the Grothendieck inequality in the context of efficient computation of $\mathrm{Opt}_{\mathbb{R}}(\cdot)$, the following natural strategy presents itself: one should replace real entries of matrices by vectors in $\ell_2$, i.e., consider elements of $M_n(\ell_2)$, and replace the orthogonality constraints underlying the inclusion $U \in O_n$, namely,
\[
  \forall\, i,j \in \{1,\ldots,n\}, \qquad \sum_{k=1}^n U_{ik} U_{jk} = \sum_{k=1}^n U_{ki} U_{kj} = \delta_{ij} ,
\]
by the corresponding constraints using scalar products. Specifically, given an $n \times n$ vector-valued matrix $X \in M_n(\ell_2)$ define two real matrices $XX^*, X^*X \in M_n(\mathbb{R})$ by
\[
  \forall\, i,j \in \{1,\ldots,n\}, \qquad
  (XX^*)_{ij} \stackrel{\mathrm{def}}{=} \sum_{k=1}^n \langle X_{ik}, X_{jk}\rangle
  \quad\text{and}\quad
  (X^*X)_{ij} \stackrel{\mathrm{def}}{=} \sum_{k=1}^n \langle X_{ki}, X_{kj}\rangle ,
  \tag{1.10}
\]
and let the set of $d$-dimensional vector-valued orthogonal matrices be given by
\[
  O_n(\mathbb{R}^d) \stackrel{\mathrm{def}}{=} \big\{ X \in M_n(\mathbb{R}^d) : XX^* = X^*X = I \big\} .
  \tag{1.11}
\]
One then considers the following quantity associated to every $M \in M_n(M_n(\mathbb{R}))$:
\[
  \mathrm{SDP}_{\mathbb{R}}(M) \stackrel{\mathrm{def}}{=} \sup_{d \in \mathbb{N}}\ \sup_{X,Y \in O_n(\mathbb{R}^d)} \sum_{i,j,k,l=1}^n M_{ijkl} \langle X_{ij}, Y_{kl}\rangle .
  \tag{1.12}
\]



Since the constraints that underlie the inclusion $X, Y \in O_n(\mathbb{R}^d)$ are linear equalities in the pairwise scalar products of the entries of $X$ and $Y$, the quantity $\mathrm{SDP}_{\mathbb{R}}(M)$ is a semidefinite program and can therefore be computed in polynomial time with arbitrarily good precision. One would therefore aim to prove the following noncommutative variant of the Grothendieck inequality (1.9):
\[
  \forall\, n \in \mathbb{N},\ \forall\, M \in M_n(M_n(\mathbb{R})), \qquad
  \mathrm{SDP}_{\mathbb{R}}(M) \leqslant O(1) \cdot \mathrm{Opt}_{\mathbb{R}}(M) .
  \tag{1.13}
\]
The term “noncommutative” refers here to the fact that $\mathrm{Opt}_{\mathbb{R}}(M)$ is an optimization problem over the noncommutative group $O_n$, while the classical Grothendieck inequality addresses an optimization problem over the commutative group $\{-1,1\}^n$. In the same vein, noncommutativity is manifested by the fact that the classical Grothendieck inequality corresponds to the special case of “diagonal” 4-tensors $M \in M_n(M_n(\mathbb{R}))$, i.e., those that satisfy $M_{ijkl} = 0$ whenever $i \neq j$ or $k \neq l$.
    Grothendieck conjectured [16] the validity of (1.13) in 1953, a conjecture that remained open until its
1978 affirmative solution by Pisier [36]. A simpler, yet still highly nontrivial proof of the noncommutative
Grothendieck inequality (1.13) was obtained by Kaijser [25]. In Section 4 we design a rounding algorithm
corresponding to (1.13) based on Kaijser’s approach. This settles the case of real 4-tensors of Theorem 1.1,
albeit with a worse approximation guarantee than the one claimed in Remark 1.2. The algorithm modeled
on Kaijser’s proof is interesting in its own right, and seems to be versatile and applicable to other problems,
such as possible non-bipartite extensions of the noncommutative Grothendieck inequality in the spirit
of [1]; we shall not pursue this direction here.
    A better approximation guarantee, and arguably an even more striking rounding algorithm, arises
from the work of Haagerup [18] on the complex version of (1.13). In Section 3 we show how the real
case of Theorem 1.1 follows formally from our results on its complex counterpart, so from now on we
focus our attention on the complex case.

1.2.1   The complex case
In what follows we let $S_{\mathbb{C}}^{d-1}$ denote the unit sphere of $\mathbb{C}^d$ (thus $S_{\mathbb{C}}^0$ can be identified with the unit circle $S^1 \subseteq \mathbb{R}^2$). The classical complex Grothendieck inequality [16, 30] asserts that there exists $K \in (0,\infty)$ such that
\[
  \forall\, n \in \mathbb{N},\ \forall\, A \in M_n(\mathbb{C}), \qquad
  \sup_{x,y \in (S_{\mathbb{C}}^{2n-1})^n} \Big| \sum_{i,j=1}^n A_{ij} \langle x_i, y_j\rangle \Big|
  \leqslant K \sup_{\alpha,\beta \in (S_{\mathbb{C}}^0)^n} \Big| \sum_{i,j=1}^n A_{ij} \alpha_i \overline{\beta_j} \Big| .
  \tag{1.14}
\]

Let $K_G^{\mathbb{C}}$ denote the infimum over those $K \in (0,\infty)$ for which (1.14) holds true. The exact value of $K_G^{\mathbb{C}}$ remains unknown, the best available bounds being $1.338 < K_G^{\mathbb{C}} < 1.4049$ (the left inequality is due to unpublished work of Davie, and the right one is due to Haagerup [19]).
    For $M \in M_n(M_n(\mathbb{C}))$ we define
\[
  \mathrm{SDP}_{\mathbb{C}}(M) \stackrel{\mathrm{def}}{=} \sup_{d \in \mathbb{N}}\ \sup_{X,Y \in U_n(\mathbb{C}^d)} \Big| \sum_{i,j,k,l=1}^n M_{ijkl} \langle X_{ij}, Y_{kl}\rangle \Big| ,
  \tag{1.15}
\]
where analogously to (1.11) we set
\[
  U_n(\mathbb{C}^d) \stackrel{\mathrm{def}}{=} \big\{ X \in M_n(\mathbb{C}^d) : XX^* = X^*X = I \big\} .
\]


Here for $X \in M_n(\mathbb{C}^d)$ the complex matrices $XX^*, X^*X \in M_n(\mathbb{C})$ are defined exactly as in (1.10), with the scalar product being the complex scalar product. Haagerup proved [18] that
\[
  \forall\, n \in \mathbb{N},\ \forall\, M \in M_n(M_n(\mathbb{C})), \qquad
  \mathrm{SDP}_{\mathbb{C}}(M) \leqslant 2 \cdot \mathrm{Opt}_{\mathbb{C}}(M) .
  \tag{1.16}
\]
Our main algorithm is an efficient rounding scheme corresponding to inequality (1.16). The constant $2$ in (1.16) is sharp, as shown in [20] (see also [37, Sec. 12]).
    We note that the noncommutative Grothendieck inequality, as it usually appears in the literature, involves a slightly more relaxed semidefinite program. In order to describe it, we first remark that instead of maximizing over $X, Y \in U_n(\mathbb{C}^d)$ in (1.15) we could equivalently maximize over $X, Y \in M_n(\mathbb{C}^d)$ satisfying $XX^*, X^*X, YY^*, Y^*Y \preceq I$, which is the same as the requirement $\|XX^*\|, \|X^*X\|, \|YY^*\|, \|Y^*Y\| \leqslant 1$, where here and in what follows $\|\cdot\|$ denotes the operator norm of matrices. This fact is made formal in Lemma 2.2 below. By relaxing the constraints to $\|XX^*\| + \|X^*X\| \leqslant 2$ and $\|YY^*\| + \|Y^*Y\| \leqslant 2$, we obtain the following quantity, which can be shown to still be a semidefinite program:
\[
  \|M\|_{\mathrm{nc}} \stackrel{\mathrm{def}}{=} \sup_{d \in \mathbb{N}}\ \sup_{\substack{X,Y \in M_n(\mathbb{C}^d) \\ \|XX^*\| + \|X^*X\| \leqslant 2 \\ \|YY^*\| + \|Y^*Y\| \leqslant 2}} \Big| \sum_{i,j,k,l=1}^n M_{ijkl} \langle X_{ij}, Y_{kl}\rangle \Big| .
  \tag{1.17}
\]
Clearly $\|M\|_{\mathrm{nc}} \geqslant \mathrm{SDP}_{\mathbb{C}}(M)$ for all $M \in M_n(M_n(\mathbb{C}))$. Haagerup proved [18] that the following stronger inequality holds true for all $n \in \mathbb{N}$ and $M \in M_n(M_n(\mathbb{C}))$:
\[
  \|M\|_{\mathrm{nc}} \leqslant 2 \cdot \mathrm{Opt}_{\mathbb{C}}(M) .
  \tag{1.18}
\]
As our main focus is algorithmic, in the following discussion we will establish a rounding algorithm for the tightest relaxation (1.16). In Section 2.3 we show that the same rounding procedure can be used to obtain an algorithmic analogue of (1.18) as well.

1.2.2   The rounding algorithm
Our main algorithm is an efficient rounding scheme corresponding to (1.16). In order to describe it, we first introduce the following notation. Let $\varphi \colon \mathbb{R} \to \mathbb{R}_+$ be given by
\[
  \varphi(t) \stackrel{\mathrm{def}}{=} \frac{1}{2} \operatorname{sech}\Big(\frac{\pi}{2} t\Big) = \frac{1}{e^{\pi t/2} + e^{-\pi t/2}} .
  \tag{1.19}
\]
One computes that $\int_{\mathbb{R}} \varphi(t)\,dt = 1$, so $\varphi$ is a density of a probability measure $\mu$ on $\mathbb{R}$, known as the hyperbolic secant distribution. By [23, Sec. 23.11] we have
\[
  \forall\, a \in (0,\infty), \qquad \int_{\mathbb{R}} a^{it} \varphi(t)\,dt = \frac{2a}{1+a^2} .
  \tag{1.20}
\]
It is possible to efficiently sample from $\mu$ using standard techniques; see, e.g., [8, Ch. IX.7].
    In what follows, given $X \in M_n(\mathbb{C}^d)$ and $z \in \mathbb{C}^d$ we denote by $\langle X, z\rangle \in M_n(\mathbb{C})$ the matrix whose entries are $\langle X, z\rangle_{jk} = \langle X_{jk}, z\rangle$.
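    For instance (a minimal sketch of one standard technique, ours rather than the reference's), $\mu$ can be sampled by inverting its distribution function $F(t) = \tfrac{2}{\pi}\arctan(e^{\pi t/2})$:

    import numpy as np

    def sample_hyperbolic_secant(rng, size=None):
        """Inverse-CDF sampling for the density phi(t) = (1/2) sech(pi*t/2).

        The CDF is F(t) = (2/pi) * arctan(exp(pi*t/2)), hence
        F^{-1}(u) = (2/pi) * log(tan(pi*u/2)) for u in (0, 1).
        """
        u = rng.uniform(size=size)
        return (2.0 / np.pi) * np.log(np.tan(np.pi * u / 2.0))

    rng = np.random.default_rng(0)
    t = sample_hyperbolic_secant(rng, size=10**6)
    # sanity check against (1.20): E[a^{it}] should approach 2a/(1+a^2)
    a = 3.0
    print(np.mean(a ** (1j * t)), 2 * a / (1 + a ** 2))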



                                                Rounding procedure

   1. Let $X, Y \in M_n(\mathbb{C}^d)$ be given as input. Choose $z \in \{1,-1,i,-i\}^d$ uniformly at random, and sample
      $t \in \mathbb{R}$ according to the hyperbolic secant distribution $\mu$.

   2. Set $X_z \stackrel{\mathrm{def}}{=} \frac{1}{\sqrt{2}} \langle X, z\rangle \in M_n(\mathbb{C})$ and $Y_z \stackrel{\mathrm{def}}{=} \frac{1}{\sqrt{2}} \langle Y, z\rangle \in M_n(\mathbb{C})$.

   3. Output the pair of matrices
\[
  (A, B) = (A(z,t), B(z,t)) \stackrel{\mathrm{def}}{=} \big( U_z |X_z|^{it},\, V_z |Y_z|^{-it} \big) \in U_n \times U_n ,
\]
      where $X_z = U_z |X_z|$ and $Y_z = V_z |Y_z|$ are the polar decompositions of $X_z$ and $Y_z$, respectively.


Figure 1: The rounding algorithm takes as input a pair of vector-valued matrices $X, Y \in M_n(\mathbb{C}^d)$. It outputs two matrices $A, B \in U_n$.
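A direct numpy/scipy transcription of Figure 1 might look as follows (our sketch, not the paper's code; it assumes $X, Y$ are stored as arrays of shape (n, n, d), and the eigenvalue floor guarding against $\log 0$ is our addition, cf. the clipping of small singular values in Section 2.2):

    import numpy as np
    from scipy.linalg import polar

    def imaginary_power(P, t, floor=1e-12):
        """P^{it} for Hermitian PSD P, via the spectral decomposition."""
        w, V = np.linalg.eigh(P)
        w = np.clip(w, floor, None)           # avoid log(0); our safeguard
        return (V * np.exp(1j * t * np.log(w))) @ V.conj().T

    def round_pair(X, Y, rng):
        """One run of the Figure 1 procedure. X, Y: arrays of shape (n, n, d)."""
        d = X.shape[2]
        z = rng.choice(np.array([1, -1, 1j, -1j]), size=d)
        u = rng.uniform()
        t = (2 / np.pi) * np.log(np.tan(np.pi * u / 2))   # hyperbolic secant sample
        # <X, z>_{jk} = <X_{jk}, z> = sum_r (X_{jk})_r * conj(z_r)
        Xz = X @ z.conj() / np.sqrt(2)
        Yz = Y @ z.conj() / np.sqrt(2)
        Ux, Px = polar(Xz)                    # Xz = Ux @ Px, with Px = |Xz|
        Uy, Py = polar(Yz)
        A = Ux @ imaginary_power(Px, t)
        B = Uy @ imaginary_power(Py, -t)
        return A, B                           # both unitary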


Theorem 1.4. Fix $n, d \in \mathbb{N}$ and $\varepsilon \in (0,1)$. Suppose that $M \in M_n(M_n(\mathbb{C}))$ and that $X, Y \in U_n(\mathbb{C}^d)$ are such that
\[
  \Big| \sum_{i,j,k,l=1}^n M_{ijkl} \langle X_{ij}, Y_{kl}\rangle \Big| \geqslant (1-\varepsilon)\, \mathrm{SDP}_{\mathbb{C}}(M) ,
  \tag{1.21}
\]
where $\mathrm{SDP}_{\mathbb{C}}(M)$ is given in (1.15). Then the rounding procedure described in Figure 1 outputs a pair of matrices $A, B \in U_n$ such that
\[
  \mathbb{E}\Big[ \Big| \sum_{i,j,k,l=1}^n M_{ijkl} A_{ij} \overline{B_{kl}} \Big| \Big] \geqslant \Big( \frac{1}{2} - \varepsilon \Big) \mathrm{SDP}_{\mathbb{C}}(M) .
  \tag{1.22}
\]
Moreover, rounding can be performed in time polynomial in $n$ and $\log(1/\varepsilon)$, and can be derandomized in time $\mathrm{poly}(n, 1/\varepsilon)$.

    While the rounding procedure of Figure 1 and the proof of Theorem 1.4 (contained in Section 2 below)
appear to be different from Haagerup’s original proof of (1.18) in [18], we derived them using Haagerup’s
ideas. One source of difference arises from changes that we introduced in order to work with the quantity
SDPC (M), while Haagerup’s argument treats the quantity kMknc . A second source of difference is that
Haagerup’s proof of (1.18) is rather indirect and nonconstructive, while it is crucial to the algorithmic
applications that were already mentioned in Section 1.1 for us to formulate a polynomial-time rounding
procedure. Specifically, Haagerup establishes the dual formulation of (1.18), through a repeated use of
duality, and he uses a bootstrapping argument that relies on nonconstructive tools from complex analysis.
The third step in Figure 1 originates from Haagerup’s complex-analytic considerations. Readers who are
accustomed to semidefinite rounding techniques will immediately notice that this step is unusual; we
give intuition for it in Section 1.2.3 below, focusing for simplicity on applying the rounding procedure to
vectors rather than matrices (i. e., the more familiar setting of the classical Grothendieck inequality).


1.2.3   An intuitive description of the rounding procedure in the commutative case

Consider the effect of the rounding procedure in the commutative case, i.e., when $X, Y \in M_n(\mathbb{C}^d)$ are diagonal matrices. Let the diagonals of $X, Y$ be $x, y \in (\mathbb{C}^d)^n$, respectively. The first step consists in performing a random projection: for $j \in \{1,\ldots,n\}$ let $\alpha_j = \langle x_j, z\rangle/\sqrt{2} \in \mathbb{C}$ and $\beta_j = \langle y_j, z\rangle/\sqrt{2} \in \mathbb{C}$, where $z$ is chosen uniformly at random from $\{1,-1,i,-i\}^d$ (alternatively, with minor modifications to the proof one may choose i.i.d. $z_j$ uniformly from the unit circle, as was done by Haagerup [18], or use standard complex Gaussians). This step results in sequences of complex numbers whose pairwise products $\alpha_k \overline{\beta_j}$, in expectation, exactly reproduce the pairwise scalar products $\langle x_k, y_j\rangle/2$. However, in general the resulting complex numbers $\alpha_k$ and $\beta_j$ may have modulus larger than $1$. Extending the “sign” rounding performed in, say, the Goemans-Williamson algorithm for Max Cut [13] to the complex domain, one could then round each $\alpha_k$ and $\beta_j$ independently by simply replacing them by their respective complex phase.
    The procedure that we consider differs from this standard practice by taking into account potential information contained in the modulus of the random complex numbers $\alpha_k, \beta_j$. Writing in polar coordinates $\alpha_k = r_k e^{i\theta_k}$ and $\beta_j = s_j e^{i\phi_j}$ we sample a real $t$ according to a specific distribution (the hyperbolic secant distribution $\mu$), and round each $\alpha_k$ and each $\beta_j$ to
\[
  a_k \stackrel{\mathrm{def}}{=} e^{i(\theta_k + t \log r_k)} \in S_{\mathbb{C}}^0
  \qquad\text{and}\qquad
  b_j \stackrel{\mathrm{def}}{=} e^{i(\phi_j - t \log s_j)} \in S_{\mathbb{C}}^0 ,
\]

respectively. Observe that this step performs a correlated rounding: the parameter t is the same for all
 j, k ∈ {1, . . . , n}.
    The proof presented in [18] uses the maximum modulus principle to show the existence of a real $t$ for which $a_k, b_j$ as defined above provide a good assignment. Intuition for the existence of such a good $t$ can be given as follows. Varying $t$ along the real line corresponds to rotating the phases of the complex numbers $\alpha_k, \beta_j$ at a speed proportional to the logarithm of their modulus: elements with very small modulus vary very fast, those with modulus $1$ are left unchanged, and elements with relatively large modulus are again varied at (logarithmically) increasing speeds. This means that the rounding procedure takes into account the fact that an element with modulus away from $1$ is a “miss”: that particular element’s phase is probably irrelevant, and should be changed. However, elements with modulus close to $1$ are “good”: their phase can be kept essentially unchanged.
    We identify a specific distribution $\mu$ such that a random $t$ distributed according to $\mu$ is good, in expectation. This results in a variation on the usual “sign” rounding technique: instead of directly keeping the phases obtained in the initial step of random projection, they are synchronously rotated for a random time $t$, at speeds depending on the associated moduli, resulting in a provably good pair of sequences $a_k, b_j$ of complex numbers with modulus $1$.
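    In code, the scalar rounding step is a one-liner (our sketch; the modulus floor guarding against $\log 0$ is our addition):

    import numpy as np

    def round_phases(alpha, t, floor=1e-12):
        """Rotate each phase by t * log(modulus); returns unit-modulus numbers."""
        r = np.maximum(np.abs(alpha), floor)
        return np.exp(1j * (np.angle(alpha) + t * np.log(r)))

    # the b_j use the opposite rotation direction: round_phases(beta, -t)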


Roadmap: In Section 2 we prove Theorem 1.4 both as stated in Section 1.2.2 and in a form based
on (1.18). The real case as well as a closely related Hermitian case are treated next, first in Section 3 as a
corollary of Theorem 1.4, and then using an alternative direct rounding procedure in Section 4. Section 5
presents the applications that were outlined in Section 1.1.


2     Proof of Theorem 1.4
In this section we prove Theorem 1.4. The rounding procedure described in Figure 1 is analyzed in
Section 2.1, while the derandomized version is presented in Section 2.2. The efficiency of the procedure
is clear; we refer to Section 2.2 for a discussion on how to discretize the choice of t. In Section 2.3 we
show how the analysis can be modified to the case of kMknc and (1.18).
     In what follows, it will be convenient to use the following notation. Given $M \in M_n(M_n(\mathbb{C}))$ and $X, Y \in M_n(\mathbb{C}^d)$, define
\[
  M(X,Y) \stackrel{\mathrm{def}}{=} \sum_{i,j,k,l=1}^n M_{ijkl} \langle X_{ij}, Y_{kl}\rangle \in \mathbb{C} .
  \tag{2.1}
\]
Thus $M(\cdot,\cdot)$ is a sesquilinear form on $M_n(\mathbb{C}^d) \times M_n(\mathbb{C}^d)$, i.e., $M(\alpha X, \beta Y) = \alpha \overline{\beta}\, M(X,Y)$ for all $X, Y \in M_n(\mathbb{C}^d)$ and $\alpha, \beta \in \mathbb{C}$. Observe that if $A, B \in M_n(\mathbb{C})$ then
\[
  M(A,B) = \sum_{i,j,k,l=1}^n M_{ijkl} A_{ij} \overline{B_{kl}}
  = \sum_{i,j,k,l=1}^n M_{ijkl} \big( A \otimes \overline{B} \big)_{(ij),(kl)} .
  \tag{2.2}
\]


2.1    Analysis of the rounding procedure
Proof of (1.22). Let $X, Y \in U_n(\mathbb{C}^d)$ be vector-valued matrices satisfying (1.21). Let $z \in \{1,-1,i,-i\}^d$ be chosen uniformly at random, and let
\[
  X_z \stackrel{\mathrm{def}}{=} \frac{1}{\sqrt{2}} \langle X, z\rangle
  \qquad\text{and}\qquad
  Y_z \stackrel{\mathrm{def}}{=} \frac{1}{\sqrt{2}} \langle Y, z\rangle
\]
be random variables taking values in $M_n(\mathbb{C})$ defined as in the second step of the rounding procedure (see Figure 1). Then,
\[
  \mathbb{E}_z\big[ M(X_z, Y_z) \big]
  = \frac{1}{2}\, \mathbb{E}_z\Big[ \sum_{r,s=1}^d \overline{z_r}\, z_s \sum_{i,j,k,l=1}^n M_{ijkl} (X_{ij})_r \overline{(Y_{kl})_s} \Big]
  = \frac{1}{2}\, M(X,Y) ,
  \tag{2.3}
\]
where we used the fact that $\mathbb{E}[\overline{z_r} z_s] = \delta_{rs}$ for every $r, s \in \{1,\ldots,d\}$.
    Observe that (1.20) implies that
\[
  \forall\, a \in (0,\infty), \qquad \mathbb{E}_t\big[ a^{it} \big] = 2a - \mathbb{E}_t\big[ a^{2+it} \big] .
\]
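(A one-line verification, included for convenience: by (1.20) we have $\mathbb{E}_t[a^{2+it}] = a^2\, \mathbb{E}_t[a^{it}] = 2a^3/(1+a^2)$, and $2a - 2a^3/(1+a^2) = 2a/(1+a^2) = \mathbb{E}_t[a^{it}]$.)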

Recall that the output of our rounding scheme as described in Figure 1 is $A = U_z |X_z|^{it}$ and $B = V_z |Y_z|^{-it}$, where $X_z = U_z |X_z|$ and $Y_z = V_z |Y_z|$ are the polar decompositions of $X_z$ and $Y_z$, respectively. Applying the above identity to the (positive) eigenvalues of $|X_z| \otimes \overline{|Y_z|}$, we deduce the matrix equality
\[
\begin{aligned}
  \mathbb{E}_t\big[ A \otimes \overline{B} \big]
  &= \mathbb{E}_t\Big[ U_z |X_z|^{it} \otimes \overline{V_z}\, \overline{|Y_z|}^{\,it} \Big] \\
  &= \big( U_z \otimes \overline{V_z} \big) \cdot \mathbb{E}_t\Big[ \big( |X_z| \otimes \overline{|Y_z|} \big)^{it} \Big] \\
  &= 2\, U_z |X_z| \otimes \overline{V_z |Y_z|} - \mathbb{E}_t\Big[ U_z |X_z|^{2+it} \otimes \overline{V_z |Y_z|^{2-it}} \Big] \\
  &= 2\, X_z \otimes \overline{Y_z} - \mathbb{E}_t\Big[ U_z |X_z|^{2+it} \otimes \overline{V_z |Y_z|^{2-it}} \Big] .
\end{aligned}
\tag{2.4}
\]


    It follows from (2.1), (2.2), (2.3) and (2.4) that
\[
  \mathbb{E}_{z,t}\big[ M(A,B) \big] = M(X,Y) - \mathbb{E}_{z,t}\Big[ M\big( U_z |X_z|^{2+it},\, V_z |Y_z|^{2-it} \big) \Big] .
  \tag{2.5}
\]
Our goal from now on is to bound the second, “error” term on the right-hand side of (2.5). Specifically, the rest of the proof is devoted to showing that for any fixed $t \in \mathbb{R}$ we have
\[
  \Big| \mathbb{E}_z\Big[ M\big( U_z |X_z|^{2+it},\, V_z |Y_z|^{2-it} \big) \Big] \Big| \leqslant \frac{1}{2}\, \mathrm{SDP}_{\mathbb{C}}(M) .
  \tag{2.6}
\]
Once established, the estimate (2.6) completes the proof of the desired expectation bound (1.22) since
\[
  \mathbb{E}_{z,t}\big[ |M(A,B)| \big]
  \overset{(2.5)\wedge(2.6)}{\geqslant} \big| M(X,Y) \big| - \frac{1}{2}\, \mathrm{SDP}_{\mathbb{C}}(M)
  \overset{(1.21)}{\geqslant} \Big( \frac{1}{2} - \varepsilon \Big) \mathrm{SDP}_{\mathbb{C}}(M) .
\]
So, for the rest of the proof, fix some t ∈ R. As a first step towards (2.6) we state the following claim.
Claim 2.1. Let $W \in M_n(\mathbb{C}^d)$ be a vector-valued matrix, and for every $r \in \{1,\ldots,d\}$ define $W_r \in M_n(\mathbb{C})$ by $(W_r)_{ij} = (W_{ij})_r$. Let $z \in \{1,-1,i,-i\}^d$ be chosen uniformly at random. Writing $W_z = \langle W, z\rangle \in M_n(\mathbb{C})$, we have
\[
  \mathbb{E}_z\big[ (W_z W_z^*)^2 \big] = (WW^*)^2 + \sum_{r=1}^d W_r \big( W^*W - W_r^* W_r \big) W_r^* ,
  \tag{2.7}
\]
\[
  \mathbb{E}_z\big[ (W_z^* W_z)^2 \big] = (W^*W)^2 + \sum_{r=1}^d W_r^* \big( WW^* - W_r W_r^* \big) W_r .
  \tag{2.8}
\]

Proof. By definition $W_z = \sum_{r=1}^d \overline{z_r}\, W_r$, and recalling (1.10) we have
\[
  WW^* = \sum_{r=1}^d W_r W_r^*
  \qquad\text{and}\qquad
  W^*W = \sum_{r=1}^d W_r^* W_r .
\]
Consequently,
\[
\begin{aligned}
  \mathbb{E}_z\big[ (W_z W_z^*)^2 \big]
  &= \mathbb{E}_z\Big[ \sum_{p,q,r,s=1}^d \overline{z_p}\, z_q\, \overline{z_r}\, z_s\; W_p W_q^* W_r W_s^* \Big] \\
  &= \sum_{p=1}^d W_p W_p^* W_p W_p^* + \sum_{\substack{p,q \in \{1,\ldots,d\} \\ p \neq q}} \big( W_p W_p^* W_q W_q^* + W_p W_q^* W_q W_p^* \big) \\
  &= \sum_{p,q=1}^d W_p W_p^* W_q W_q^* + \sum_{p,q=1}^d W_p W_q^* W_q W_p^* - \sum_{p=1}^d W_p W_p^* W_p W_p^* \\
  &= \Big( \sum_{p=1}^d W_p W_p^* \Big)^2 + \sum_{p=1}^d W_p \Big( \sum_{q=1}^d W_q^* W_q \Big) W_p^* - \sum_{p=1}^d W_p W_p^* W_p W_p^* ,
\end{aligned}
\]
proving (2.7). A similar calculation yields (2.8).


    Now, for every $t \in \mathbb{R}$ define two vector-valued matrices
\[
  F(t),\, G(t) \in M_n\big( \mathbb{C}^{\{1,-1,i,-i\}^d} \big)
\]
by setting for every $j, k \in \{1,\ldots,n\}$ and $z \in \{1,-1,i,-i\}^d$,
\[
  \big( F(t)_{jk} \big)_z \stackrel{\mathrm{def}}{=} \frac{1}{2^d} \big( U_z |X_z|^{2+it} \big)_{jk}
  \qquad\text{and}\qquad
  \big( G(t)_{jk} \big)_z \stackrel{\mathrm{def}}{=} \frac{1}{2^d} \big( V_z |Y_z|^{2-it} \big)_{jk} .
  \tag{2.9}
\]
Thus,
\[
  M\big( F(t), G(t) \big)
  = \frac{1}{4^d} \sum_{z \in \{1,-1,i,-i\}^d} M\big( U_z |X_z|^{2+it},\, V_z |Y_z|^{2-it} \big)
  = \mathbb{E}_z\Big[ M\big( U_z |X_z|^{2+it},\, V_z |Y_z|^{2-it} \big) \Big] .
  \tag{2.10}
\]

Moreover, recalling that $X_z = U_z |X_z|$ is the polar decomposition of $X_z$, we have
\[
  F(t) F(t)^* = \frac{1}{4^d} \sum_{z \in \{1,-1,i,-i\}^d} U_z |X_z|^4 U_z^*
  = \mathbb{E}_z\big[ U_z |X_z|^4 U_z^* \big] = \mathbb{E}_z\big[ (X_z X_z^*)^2 \big] .
  \tag{2.11}
\]
Similarly $F(t)^* F(t) = \mathbb{E}_z\big[ (X_z^* X_z)^2 \big]$, so that an application of Claim 2.1 with $W = \frac{1}{\sqrt{2}} X$ yields, using $XX^* = X^*X = I$ since $X \in U_n(\mathbb{C}^d)$,
\[
  F(t) F(t)^* + \frac{1}{4} \sum_{r=1}^d X_r X_r^* X_r X_r^*
  = F(t)^* F(t) + \frac{1}{4} \sum_{r=1}^d X_r^* X_r X_r^* X_r
  = \frac{1}{2}\, I .
  \tag{2.12}
\]

Analogously,
\[
  G(t) G(t)^* + \frac{1}{4} \sum_{r=1}^d Y_r Y_r^* Y_r Y_r^*
  = G(t)^* G(t) + \frac{1}{4} \sum_{r=1}^d Y_r^* Y_r Y_r^* Y_r
  = \frac{1}{2}\, I .
  \tag{2.13}
\]
Using that each term $X_r^* X_r X_r^* X_r$ and $Y_r^* Y_r Y_r^* Y_r$ is positive semidefinite, the two equations above imply that $F(t), G(t)$ satisfy the norm bounds
\[
  \max\big\{ \|F(t)F(t)^*\|,\, \|F(t)^*F(t)\|,\, \|G(t)G(t)^*\|,\, \|G(t)^*G(t)\| \big\} \leqslant \frac{1}{2} .
  \tag{2.14}
\]
As shown in Lemma 2.2 below, (2.14) implies that there exists a pair of vector-valued matrices
\[
  R(t),\, S(t) \in U_n\big( \mathbb{C}^{d+2n^2} \big)
\]
such that
\[
  M\big( R(t), S(t) \big) = M\big( \sqrt{2}\, F(t),\, \sqrt{2}\, G(t) \big) .
  \tag{2.15}
\]
(This fact can also be derived directly using (2.12) and (2.13).) Recalling the definition of $\mathrm{SDP}_{\mathbb{C}}(M)$ in (1.15), it follows that for every $t \in \mathbb{R}$,
\[
  \Big| \mathbb{E}_z\Big[ M\big( U_z |X_z|^{2+it},\, V_z |Y_z|^{2-it} \big) \Big] \Big|
  \overset{(2.10)}{=} \big| M\big( F(t), G(t) \big) \big|
  \overset{(2.15)}{=} \frac{1}{2}\, \big| M\big( R(t), S(t) \big) \big|
  \leqslant \frac{1}{2}\, \mathrm{SDP}_{\mathbb{C}}(M) ,
  \tag{2.16}
\]
completing the proof of (2.6).


Lemma 2.2. Let X,Y ∈ Mn (Cd ) be such that max(kX ∗ Xk, kXX ∗ k, kY ∗Y k, kYY ∗ k) ⩽ 1. Then there exist
                2
R, S ∈ Un (Cd+2n ) such that for every M ∈ Mn (Mn (C)) we have M(R, S) = M(X,Y ). Moreover, R and S
can be computed from X and Y in time poly(n, d).
Proof. Let A = I − XX ∗ and B = I − X ∗ X, and note that A, B ⩾ 0 and Tr(A) = Tr(B). Write the spectral
decompositions of A and B as
                                       n                            n
                              A = ∑ λi (ui u∗i )    and      B = ∑ µ j (v j v∗j ) ,
                                      i=1                          j=1

respectively. Set σ = ∑ni=1 λi = ∑nj=1 µj, and define

          R def= X ⊕ ( ⊕ni,j=1 √(λiµj/σ) (ui vj∗) ) ⊕ 0Mn(C^{n²}) ∈ Mn(Cd ⊕ C^{n²} ⊕ C^{n²}) .

With this definition we have RR∗ = XX∗ + A = I and R∗R = X∗X + B = I, so R ∈ Un(C^{d+2n²}). Let S ∈ Un(C^{d+2n²}) be defined analogously from Y , with the last two blocks of n² coordinates permuted. One
checks that M(R, S) = M(X,Y ), as required.
    Finally, A, B, their spectral decomposition, and the resulting R, S can all be computed in time poly(n, d)
from X,Y .
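
The construction in this proof is short enough to implement directly. Below is a minimal numpy sketch (the function name complete_to_unitary and the representation of X ∈ Mn(Cd) as a list of d complex n × n component matrices are our choices, not the paper's):

import numpy as np

def complete_to_unitary(X):
    # Sketch of Lemma 2.2, assuming max(||XX*||, ||X*X||) <= 1.
    # X is a list of d complex n x n matrices; returns R as a list of
    # d + 2n^2 matrices with sum_r R_r R_r* = sum_r R_r* R_r = I.
    d, n = len(X), X[0].shape[0]
    A = np.eye(n) - sum(Xr @ Xr.conj().T for Xr in X)      # A = I - XX*
    B = np.eye(n) - sum(Xr.conj().T @ Xr for Xr in X)      # B = I - X*X
    lam, U = np.linalg.eigh(A)                             # A = sum_i lam_i u_i u_i*
    mu, V = np.linalg.eigh(B)                              # B = sum_j mu_j v_j v_j*
    sigma = lam.sum()                                      # sigma = Tr(A) = Tr(B)
    R = [Xr.astype(complex) for Xr in X]
    for i in range(n):
        for j in range(n):
            # one extra coordinate carrying sqrt(lam_i mu_j / sigma) u_i v_j*
            w = 0.0 if sigma < 1e-12 else np.sqrt(max(lam[i], 0) * max(mu[j], 0) / sigma)
            R.append(w * np.outer(U[:, i], V[:, j].conj()))
    # trailing zero block of n^2 coordinates
    R += [np.zeros((n, n), dtype=complex) for _ in range(n * n)]
    return R

For S one runs the same construction on Y with the two trailing blocks of n² coordinates swapped; the appended coordinates of R and S are then supported on disjoint blocks, so they do not affect M(R, S), and M(R, S) = M(X,Y) follows.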

2.2   Derandomized rounding

Note that we can always assume that X,Y ∈ Un(Cd), where d ⩽ 2n². We start by slightly changing the projection step. Define X′z to be the matrix obtained from the projection Xz = (1/√2)⟨X, z⟩ by replacing every singular value of Xz that is smaller than ε with ε, and define Y′z analogously. Then, writing

                      2X′z ⊗ Y′z = 2Xz ⊗ Yz + 2(X′z − Xz) ⊗ Yz + 2X′z ⊗ (Y′z − Yz) ,

we see that in the analogue of (2.5) the first term is at least M(X,Y) − 4εd · SDPC (M). Here we use that kXzk, kYzk ⩽ d, which follows from the triangle inequality and X,Y ∈ Un(Cd). For the second term in (2.5), the previous analysis remains unchanged, provided we prove an analogue of (2.14). Using (2.11) and the analogous equations for F(t)∗F(t), G(t)G(t)∗ and G(t)∗G(t), it will suffice to bound four expressions such as

                                      Ez[ (X′z(X′z)∗)² ] .

One checks that the modification we made to the rounding can only increase this quantity by ε⁴ (even for each fixed z); hence, following the previous analysis, we get the bound in (2.16) with SDPC (M)/2 replaced by (1 + ε⁴) SDPC (M)/2.
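
The modified projection step is a one-line change to an SVD; a minimal sketch (function name ours):

import numpy as np

def floor_singular_values(Xz, eps):
    # Return X'_z: the matrix Xz with every singular value below eps raised
    # to eps, so that ||(X'_z)^{-1}|| <= 1/eps.
    U, s, Vh = np.linalg.svd(Xz)
    return (U * np.maximum(s, eps)) @ Vh      # U diag(max(s, eps)) V*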
    Next, we observe that the coordinates of z need not be independent: it suffices that they be chosen from a four-wise independent distribution. As a result, there are only poly(n) possible values of z and they can be enumerated efficiently. Therefore, we can assume that we have a value z ∈ {1, −1, i, −i}^d for which

                       Et[ |M(Uz|X′z|^{it}, Vz|Y′z|^{−it})| ] ⩾ (1/2 − ε) SDPC (M) .          (2.17)


    Notice that for some universal constant c > 0, with probability at least 1 − ε, a sample from the hyperbolic secant distribution is at most c log(1/ε) in absolute value. Therefore, denoting the restriction of the hyperbolic secant distribution to the interval [−c log(1/ε), c log(1/ε)] by µ′, and using the fact that the expression inside the expectation in (2.17) is never larger than SDPC (M), for t′ distributed according to µ′ we have

                       Et′[ |M(Uz|X′z|^{it′}, Vz|Y′z|^{−it′})| ] ⩾ (1/2 − 2ε) SDPC (M) .
Moreover, for any t, t′ ∈ R,

      M(Uz|X′z|^{it}, Vz|Y′z|^{−it}) − M(Uz|X′z|^{it′}, Vz|Y′z|^{−it′})
          = M(Uz(|X′z|^{it} − |X′z|^{it′}), Vz|Y′z|^{−it}) + M(Uz|X′z|^{it′}, Vz(|Y′z|^{−it} − |Y′z|^{−it′}))
          ⩽ |M(Uz(|X′z|^{it} − |X′z|^{it′}), Vz|Y′z|^{−it})| + |M(Uz|X′z|^{it′}, Vz(|Y′z|^{−it} − |Y′z|^{−it′}))| .          (2.18)
The first absolute value in (2.18) is at most

      SDPC (M) · kUz(|X′z|^{it} − |X′z|^{it′})k · kVz|Y′z|^{−it}k = SDPC (M) · k |X′z|^{it} − |X′z|^{it′} k
                                                                 ⩽ SDPC (M) · log(max{kX′zk, k(X′z)^{−1}k}) · |t − t′| .

We have kX′zk ⩽ d as explained above, and k(X′z)^{−1}k ⩽ 1/ε by the way our modified rounding procedure was defined. Similar bounds hold for Y′z. It therefore suffices to pick t from a grid of size O(log(1/ε) max(1/ε², d/ε)).
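
In code, the final derandomized choice of t is a plain enumeration over this grid. A hedged sketch (the callable objective, the argument d, and the stand-in constant c = 1 are our own illustrative choices):

import numpy as np

def best_t_on_grid(objective, eps, d, c=1.0):
    # Enumerate a grid over [-c log(1/eps), c log(1/eps)] with spacing
    # min(eps^2, eps/d), i.e., O(log(1/eps) max(1/eps^2, d/eps)) points,
    # and return the grid point maximizing `objective`, which should map
    # t to |M(U_z |X'_z|^{it}, V_z |Y'_z|^{-it})|.
    T = c * np.log(1.0 / eps)
    step = min(eps ** 2, eps / d)
    grid = np.arange(-T, T + step, step)
    return max(grid, key=objective)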

2.3    The rounding procedure in the case of (1.18)
Theorem 1.4 addressed the performance of the rounding procedure described in Figure 1 with respect to
inequality (1.16). Here we prove that this rounding procedure has the same performance with respect to
the noncommutative Grothendieck inequality (1.18) as well. This is the content of the following theorem.
Theorem 2.3. Fix n, d ∈ N and ε ∈ (0, 1). Suppose that M ∈ Mn (Mn (C)) and that X,Y ∈ Mn (Cd ) satisfy
                                    max {kXX ∗ k + kX ∗ Xk, kYY ∗ k + kY ∗Y k} ⩽ 2                                  (2.19)
and
                           ∑ni,j,k,l=1 Mijkl ⟨Xij, Ykl⟩ ⩾ (1 − ε) kMknc ,          (2.20)

where kMknc was defined in (1.18). Then the rounding procedure described in Figure 1 outputs a pair of matrices A, B ∈ Un such that

                     E[ | ∑ni,j,k,l=1 Mijkl Aij Bkl | ] ⩾ (1/2 − ε) kMknc .          (2.21)

Proof. We shall explain how to modify the argument presented in Section 2.1, relying on the notation that was introduced there. All we need to do is to replace (2.16) by the assertion

                        ∀t ∈ R,     |M(F(t), G(t))| ⩽ (1/2) kMknc .          (2.22)
To this end we use the following corollary of Claim 2.1, which is a slight variant of [18, Lem. 4.1].


Corollary 2.4. Let W ∈ Mn(Cd) be a vector-valued matrix and z ∈ {1, −1, i, −i}^d chosen uniformly at random. Writing Wz = ⟨W, z⟩ ∈ Mn(C), we have

                  k Ez[(WzWz∗)²] k + k Ez[(Wz∗Wz)²] k ⩽ ( kWW∗k + kW∗Wk )² .          (2.23)

Proof. Starting from (2.7) and noting that

              k ∑dr=1 Wr(W∗W − Wr∗Wr)Wr∗ k ⩽ kW∗Wk · k ∑dr=1 WrWr∗ k = kW∗Wk · kWW∗k ,

we obtain the inequality

                  k Ez[(WzWz∗)²] k ⩽ kWW∗k² + kW∗Wk · kWW∗k .          (2.24)

Similarly, from (2.8) we get

                  k Ez[(Wz∗Wz)²] k ⩽ kW∗Wk² + kWW∗k · kW∗Wk .          (2.25)

By summing (2.24) and (2.25) one deduces (2.23).
    Combining (2.11) with (2.23) for W = X/√2, we have

                  kF(t)F(t)∗k + kF(t)∗F(t)k ⩽ (1/4)( kXX∗k + kX∗Xk )² ⩽ 1 ,          (2.26)

where the last inequality uses (2.19).
Analogously,

                  kG(t)G(t)∗k + kG(t)∗G(t)k ⩽ (1/4)( kYY∗k + kY∗Yk )² ⩽ 1 .          (2.27)
Recalling the definition of kMknc, it follows from (2.26) and (2.27) that for every t ∈ R,

      | Ez[ M(Uz|Xz|^{2+it}, Vz|Yz|^{2−it}) ] | = |M(F(t), G(t))| ⩽ (1/2) kMknc ,          (2.28)

where the equality is (2.10).
Hence,

                  Ez,t[ |M(A, B)| ] ⩾ M(X,Y) − (1/2) kMknc ⩾ (1/2 − ε) kMknc ,

where the first inequality uses (2.5) and (2.28), and the second uses (2.20). This completes the proof of the desired expectation bound (2.21).

Remark 2.5. The following example, due to Haagerup [18], shows that the factor 2 approximation guarantee obtained in Theorem 2.3 is optimal: the best constant in (1.18) equals 2. Let M ∈ Mn(Mn(C)) be given by M1jk1 = δjk, and Mijkl = 0 if (i, l) ≠ (1, 1). A direct computation shows that OptC (M) = 1. Define X,Y ∈ Mn(Cn) by X1j = Yj1 = √(2/(n + 1)) ej ∈ Cn for j ∈ {1, . . . , n}, and let all other entries of X and Y vanish (here e1, . . . , en is the standard basis of Cn). Using these two vector-valued matrices one shows that kMknc ⩾ 2n/(n + 1). Recall that in the introduction we mentioned that it was shown in [20] that the best constant in the weaker inequality (1.16) is also 2, but the example exhibiting this stronger fact is more involved.


3    The real and Hermitian cases
The n × n Hermitian matrices are denoted Hn. A 4-tensor M ∈ Mn(Mn(C)) ≅ Mn(C) ⊗ Mn(C) is said to be Hermitian if Mijkl = M̄jilk for all i, j, k, l ∈ {1, . . . , n}, the bar denoting complex conjugation. Investigating the noncommutative Grothendieck
inequality in the setting of Hermitian M is most natural in applications to quantum information, while
problems in real optimization as described in the introduction lead to real M ∈ Mn (Mn (R)). These special
cases are treated in this section.
    Consider the following Hermitian analogue of the quantity OptC (M):

            Opt∗C (M) def= sup { | ∑ni,j,k,l=1 Mijkl Aij Bkl | : A, B ∈ Hn , kAk, kBk ⩽ 1 } .


Note that the convex hull of Un consists of all the matrices A ∈ Mn(C) with kAk ⩽ 1, so by convexity, for every M ∈ Mn(Mn(C)) we have

            OptC (M) = sup { | ∑ni,j,k,l=1 Mijkl Aij Bkl | : A, B ∈ Mn(C) , kAk, kBk ⩽ 1 } .          (3.1)


This explains why Opt∗C (M) should indeed be viewed as a Hermitian analogue of OptC (M). The real analogue of (3.1) is that, since the convex hull of On consists of all the matrices A ∈ Mn(R) with kAk ⩽ 1, for every M ∈ Mn(Mn(R)) we have

            OptR (M) = sup { ∑ni,j,k,l=1 Mijkl Aij Bkl : A, B ∈ Mn(R) , kAk, kBk ⩽ 1 } .          (3.2)


    The following theorem establishes an algorithmic equivalence between the problems of approximating
either of these two quantities.

Theorem 3.1. For every K ∈ [1, ∞) the following two assertions are equivalent.

    1. There exists a polynomial-time algorithm Alg∗ that takes as input a Hermitian M ∈ Mn (Mn (C))
       and outputs A, B ∈ Hn with max{kAk, kBk} ⩽ 1 and Opt∗C (M) ⩽ K|M(A, B)|.

    2. There exists a polynomial-time algorithm Alg that takes as input M ∈ Mn (Mn (R)) and outputs
       U,V ∈ On such that OptR (M) ⩽ KM(U,V ).
    In Section 3.2 we show that for every K > 2√2 there exists an algorithm Alg∗ as in assertion 1
of Theorem 3.1. Consequently, we obtain the algorithm for computing OptR (M) whose existence was
claimed in Theorem 1.1. The implication 1 =⇒ 2 of Theorem 3.1 is the only part of Theorem 3.1
that will be used in this article; the reverse direction 2 =⇒ 1 is included here for completeness. Both
directions of Theorem 3.1 are proved in Section 3.3.


3.1    Two-dimensional rounding
In this section we give an algorithmic version of Krivine's proof [28] that the 2-dimensional real Grothendieck constant satisfies KG (2) ⩽ √2. The following theorem is implicit in the proof of [28, Thm. 1].
Theorem 3.2 (Krivine). Let g : R → R be defined by g(x) = sign(cos(x)), and let f : [0, π/2) → R be given by

              f(t) def= { 1                                              if 0 ⩽ t ⩽ π/4 ,
                          (6/π)(π/2 − t) − (1/2)(4/π)³(π/2 − t)³         if π/4 ⩽ t < π/2 .

Extend f to a function defined on all of R by requiring that it is even and f(x + π) = −f(x) for all x ∈ R. There exists a sequence {b2ℓ+1}∞ℓ=0 ⊆ R such that for every L ∈ N the numbers {b1, b3, . . . , b2L+1} can be computed in poly(L) time, ∑∞ℓ=L+1 |b2ℓ+1| ⩽ C/L for some universal constant C, ∑∞ℓ=0 |b2ℓ+1| = 1, and

    ∀x, y ∈ R,     cos(x − y) = √2 ∑∞ℓ=0 b2ℓ+1 · (1/2π) ∫_{−π}^{π} f((2ℓ+1)x − t) g(t − (2ℓ+1)y) dt .
                                               `=0     2π −π
    An explicit formula for the sequence {b2ℓ+1}∞ℓ=0 can be extracted as follows from the proof of [28, Thm. 1]. For any ℓ ⩾ 0, define a2ℓ = 0,

              a2ℓ+1 = (−1)^ℓ cos((2ℓ+1)π/4) · (16/(π²(2ℓ+1)⁴)) · ( 1/(2ℓ+1) − (−1)^ℓ π/4 ) ,

              b1 = √2 (π/4)³ / (3a1) ,

and for ℓ > 0,

              b2ℓ+1 = −(1/a1) ∑_{d | (2ℓ+1), d ≠ 1} a_d b_{(2ℓ+1)/d} .

Then |a2ℓ+1| = O(1/ℓ⁴), from which one deduces the crude bound |b2ℓ+1| = O(1/ℓ²).
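
The recursion just displayed is straightforward to evaluate; here is a minimal numpy sketch (function name ours), returning the coefficients b1, b3, . . . , b2L+1 as a dictionary keyed by the odd index:

import numpy as np

def krivine_coefficients(L):
    # a_{2l+1} from the explicit formula above; a vanishes at even indices.
    a = {}
    for l in range(L + 1):
        k = 2 * l + 1
        a[k] = ((-1) ** l * np.cos(k * np.pi / 4) * 16.0 / (np.pi ** 2 * k ** 4)
                * (1.0 / k - (-1) ** l * np.pi / 4))
    b = {1: np.sqrt(2) * (np.pi / 4) ** 3 / (3 * a[1])}
    for l in range(1, L + 1):
        k = 2 * l + 1
        # divisors d of k with d != 1 (all divisors of an odd k are odd)
        divisors = [d for d in range(3, k + 1, 2) if k % d == 0]
        b[k] = -sum(a[d] * b[k // d] for d in divisors) / a[1]
    return b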
    Figure 2 describes a two-dimensional rounding scheme derived from Theorem 3.2. The following
claim states its correctness in a way that will be useful for us later.
Claim 3.3. Let ε > 0 and for every j, k ∈ {1, . . . , n} let xj, yk ∈ C satisfy |xj| = |yk| = 1. Then the rounding procedure described in Figure 2 runs in time poly(n, 1/ε) and returns λj, µk ∈ R with |λj|, |µk| ⩽ 1 for every j, k ∈ {1, . . . , n}, and

                  E[λj µk] = (1/√2) ℜ(xj ȳk) + ε ⟨x′j, y′k⟩ ,          (3.3)

where x′j, y′k ∈ L2(R) are such that kx′jk2, ky′kk2 ⩽ 1 and ℜ(·) denotes the real part.
Proof. Fix j, k ∈ {1, . . . , n} and let θj, φk and λj, µk be as defined in Steps 2 and 4 of the rounding procedure, respectively. Applying Theorem 3.2,

      E[λj µk] = ∑Lℓ=0 b2ℓ+1 · (1/2π) ∫_{−π}^{π} f((2ℓ+1)θj − t) g(t − (2ℓ+1)φk) dt = (1/√2) cos(θj − φk) + ηjk ,




                                         Two-dimensional rounding procedure

   1. Let ε > 0 and, for j, k ∈ {1, . . . , n}, let xj, yk ∈ C with |xj| = |yk| = 1 be given as input. Let f, g, C and {b2ℓ+1}∞ℓ=0 be as in Theorem 3.2.

   2. For every j, k let θ j ∈ [0, 2π) (resp. φk ∈ [0, 2π)) be the angle that x j (resp. yk ) makes with the
      x-axis.

   3. Select t ∈ [−π, π] uniformly at random. Let L = ⌈C/ε⌉ and p = 1 − ∑∞ℓ=L+1 |b2ℓ+1|. Select ℓ ∈ {−1, 0, . . . , L} with probability Pr(−1) = 1 − p and Pr(ℓ) = |b2ℓ+1| for ℓ ∈ {0, . . . , L}.

   4. For every j, k, if ℓ ⩾ 0 then set λj def= sign(b2ℓ+1) f((2ℓ+1)θj − t) and µk def= g(t − (2ℓ+1)φk). Otherwise, set λj = 0, µk = 0.

   5. Return (λ j ) j∈{1,...,n} and (µk )k∈{1,...,n} .



Figure 2: The two-dimensional rounding algorithm takes as input real 2-dimensional unit vectors. It
returns real numbers of absolute value at most 1.

where

              ηjk def= − ∑∞ℓ=L+1 b2ℓ+1 · (1/2π) ∫_{−π}^{π} f((2ℓ+1)θj − t) g(t − (2ℓ+1)φk) dt .

By definition, cos(θj − φk) = ℜ(xj ȳk). Using 1 − ε ⩽ p ⩽ 1, which follows from the bound stated in Theorem 3.2, ηjk equals 1 − p ⩽ ε times a weighted average of products of certain values taken by f and g, the former depending only on θj and the latter only on φk. Equivalently, this weighted average can be written as the inner product of two vectors x′j and y′k of norm at most 1. Finally, all steps of the rounding procedure can be performed in time polynomial in n and 1/ε.
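
For concreteness, here is a hedged numpy sketch of the full procedure of Figure 2. It assumes the krivine_coefficients sketch given after Theorem 3.2, and it uses C = 1 as a stand-in for the universal constant C of Theorem 3.2, whose value is not specified here:

import numpy as np

def g(x):
    return np.sign(np.cos(x))                      # g(x) = sign(cos x)

def f(x):
    # Even, pi-antiperiodic extension of the piecewise definition of f.
    x = np.abs(np.asarray(x, dtype=float)) % (2 * np.pi)
    sgn = np.where(x >= np.pi, -1.0, 1.0)          # f(x + pi) = -f(x)
    x = np.where(x >= np.pi, x - np.pi, x)
    sgn = np.where(x > np.pi / 2, -sgn, sgn)       # f(x) = -f(pi - x) on (pi/2, pi)
    x = np.where(x > np.pi / 2, np.pi - x, x)
    s = np.pi / 2 - x
    base = np.where(x <= np.pi / 4, 1.0,
                    (6 / np.pi) * s - 0.5 * (4 / np.pi) ** 3 * s ** 3)
    return sgn * base

def two_dim_round(thetas, phis, eps, rng=None):
    rng = rng or np.random.default_rng()
    thetas, phis = np.asarray(thetas, float), np.asarray(phis, float)
    C = 1.0                                        # stand-in constant
    L = int(np.ceil(C / eps))
    b = krivine_coefficients(L)
    probs = np.array([0.0] + [abs(b[2 * l + 1]) for l in range(L + 1)])
    probs[0] = max(1.0 - probs.sum(), 0.0)         # Pr(l = -1) = 1 - p
    probs /= probs.sum()                           # defensive renormalization
    t = rng.uniform(-np.pi, np.pi)
    l = rng.choice(np.arange(-1, L + 1), p=probs)
    if l < 0:
        return np.zeros_like(thetas), np.zeros_like(phis)
    k = 2 * l + 1
    return np.sign(b[k]) * f(k * thetas - t), g(t - k * phis)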

3.2     Rounding in the Hermitian case
Let M ∈ Mn(Mn(C)) be Hermitian, and X,Y ∈ Un(Cd). For every r ∈ {1, . . . , d} define as usual Xr, Yr ∈ Mn(C) by (Xr)jk = (Xjk)r and (Yr)jk = (Yjk)r. Define X′, Y′ ∈ Mn(C^{2d}) by

              X′jk def= ∑dp=1 ( ((Xp + Xp∗)/2)jk e2p−1 + i ((Xp − Xp∗)/2)jk e2p ) ∈ C^{2d}

and

              Y′jk def= ∑dp=1 ( ((Yp + Yp∗)/2)jk e2p−1 + i ((Yp − Yp∗)/2)jk e2p ) ∈ C^{2d} ,

where e1, . . . , e2d is the canonical basis of C^{2d}. Then (X′)(X′)∗ = (X′)∗(X′) = (XX∗ + X∗X)/2 = I, so X′ ∈ Un(C^{2d}) and similarly Y′ ∈ Un(C^{2d}). Moreover, since M is Hermitian, |M(X,Y)| = |M(X′,Y′)|.
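
This Hermitization of the component matrices is immediate to implement; a minimal numpy sketch (function name ours), with X given as a list of d complex n × n component matrices:

import numpy as np

def hermitize_components(X):
    # Replace the d component matrices X_p by the 2d Hermitian matrices
    # (X_p + X_p*)/2 and i(X_p - X_p*)/2. One checks that the resulting X'
    # satisfies X'X'* = X'*X' = (XX* + X*X)/2, and that |M(X, Y)| is
    # preserved when M is Hermitian.
    Xp = []
    for A in X:
        Xp.append((A + A.conj().T) / 2)
        Xp.append(1j * (A - A.conj().T) / 2)
    return Xp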



                                      Hermitian rounding procedure

   1. Let X,Y ∈ Mn (Cd ) and ε > 0 be given as input.

   2. Let A, B ∈ Mn (C) be the unitary matrices returned by the complex rounding procedure described
      in Figure 1. If necessary, multiply A by a complex phase to ensure that M(A, B) is real. Write the
      spectral decompositions of A, B as
                        A = ∑nj=1 e^{iθj} uj uj∗      and      B = ∑nk=1 e^{iφk} vk vk∗ ,

      where θ j , φk ∈ R and u j , vk ∈ Cn .
   3. Apply the two-dimensional rounding algorithm from Figure 2 to the vectors xj def= e^{iθj} and yk def= e^{iφk}. Let λj, µk be the results.

   4. Output
                        A′ def= ∑nj=1 λj uj uj∗      and      B′ def= ∑nk=1 µk vk vk∗ .




Figure 3: The Hermitian rounding algorithm takes as input a pair of vector-valued matrices X,Y ∈ Mn(Cd). It outputs two Hermitian matrices A′, B′ ∈ Hn of norm at most 1.


This shows that for the purpose of proving the noncommutative Grothendieck inequality for Hermitian
M we may assume without loss of generality that the “component matrices” of X,Y are Hermitian
themselves. Nevertheless, even in this case the rounding algorithm described in Figure 1 returns unitary
matrices A, B that are not necessarily Hermitian. The following simple argument shows how Krivine’s
two-dimensional rounding scheme can be applied to the eigenvalues of A, B to obtain Hermitian matrices of norm at most 1, at the loss of a factor √2 in the approximation. A similar argument, albeit not explicitly algorithmic, also appears in [41, Claim 4.7].
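
A hedged sketch of the procedure of Figure 3, assuming the two_dim_round sketch above. We use a complex Schur decomposition, which for a unitary (hence normal) matrix yields an orthonormal eigenbasis, so the output is Hermitian of norm at most 1:

import numpy as np
from scipy.linalg import schur

def hermitian_round(A, B, eps, rng=None):
    # A, B are the unitaries produced by the complex rounding procedure,
    # with A already multiplied by a phase making M(A, B) real.
    TA, U = schur(A.astype(complex), output='complex')  # A = U diag(e^{i theta}) U*
    TB, V = schur(B.astype(complex), output='complex')
    thetas, phis = np.angle(np.diag(TA)), np.angle(np.diag(TB))
    lam, mu = two_dim_round(thetas, phis, eps, rng)
    Ap = (U * lam) @ U.conj().T                         # sum_j lambda_j u_j u_j*
    Bp = (V * mu) @ V.conj().T                          # sum_k mu_k v_k v_k*
    return Ap, Bp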

Theorem 3.4. Let n be an integer, M ∈ Mn(Mn(C)) Hermitian, ε ∈ (0, 1) and X,Y ∈ Un(Cd) such that

                        |M(X, Y)| ⩾ (1 − ε) SDPC (M) .

Then the rounding procedure described in Figure 3 runs in time polynomial in n and 1/ε and outputs a pair of Hermitian matrices A′, B′ ∈ Hn with norm at most 1 such that

                        E[ |M(A′, B′)| ] ⩾ ( 1/(2√2) − (1 + 1/√2) ε ) SDPC (M) .

Proof. Let A, B ∈ Mn(C) be as defined in Step 2 of Figure 3, and assume as in Figure 3 that M(A, B) is real. By Theorem 1.4 we have E[|M(A, B)|] ⩾ (1/2 − ε) SDPC (M). Hence to conclude it will suffice to


show that for any fixed pair of matrices A, B ∈ Mn(C),

                  E[ |M(A′, B′)| ] ⩾ (1/√2) M(A, B) − ε SDPC (M) ,          (3.4)
where A′, B′ are as returned by the rounding procedure and the expectation is over the random choices made in the two-dimensional rounding step, i. e., Step 3 of Figure 3. Applying Claim 3.3,

      E[ |M(A′, B′)| ] ⩾ ∑nj,k=1 M(uj uj∗, vk vk∗) E[λj µk]

                       ⩾ (1/√2) ∑nj,k=1 M(uj uj∗, vk vk∗) ℜ(e^{i(θj−φk)}) − ε ∑nj,k=1 ⟨x′j, y′k⟩ M(uj uj∗, vk vk∗)

                       = (1/√2) M(A, B) − ε M(W, Z) ,          (3.5)
where in the last inequality, for the first term we used that M(u j u∗j , vk v∗k ) is real since M is Hermitian, and
for the second term we defined the vector-valued matrices
                  W def= ∑nj=1 x′j uj uj∗ ∈ Mn(C^{2n})      and      Z def= ∑nk=1 y′k vk vk∗ ∈ Mn(C^{2n}) ,

where multiplication of the vector x0j (resp. y0k ) with the scalar-valued matrix u j u∗j (resp. vk v∗k ) is taken
entrywise. Here we assumed that the vectors x0j , y0k from Claim 3.3 lie in C2n , which is without loss of
generality since there are 2n of them. One checks that
                                                              n
                                                   WW ∗ = ∑ kx0j k22 u j u∗j .
                                                             j=1

Since kx0j k2 ⩽ 1 for all j ∈ {1, . . . , n}, it follows that kWW ∗ k ⩽ 1. Similarly

                                            max{kW ∗W k, kZZ ∗ k, kZ ∗ Zk} ⩽ 1 .

Applying Lemma 2.2 we obtain R, S ∈ Un (C2n(n+1) ) such that M(W, Z) = M(R, S), hence |M(W, Z)| ⩽
SDPC (M). Equation (3.5) then implies the desired estimate (3.4).

3.3    Proof of Theorem 3.1
In this section we prove Theorem 3.1. We first record for future use the following simple lemma, which
is an algorithmic version of (3.2).
Lemma 3.5. There exists a polynomial-time algorithm that takes as input a 4-tensor M ∈ Mn(Mn(R)) and two matrices A, B ∈ Mn(R) with max{kAk, kBk} ⩽ 1 and outputs two orthogonal matrices U,V ∈ On such that

                  ∑ni,j,k,l=1 Mijkl Aij Bkl ⩽ ∑ni,j,k,l=1 Mijkl Uij Vkl .




Proof. Write the singular value decompositions of A, B as A = ∑ni=1 σi ei fi∗ and B = ∑ni=1 τi gi h∗i , where
each of the sequences (ei )ni=1 , ( fi )ni=1 , (gi )ni=1 , (hi )ni=1 ⊆ Rn is orthonormal, and σ , τ ∈ [0, 1]n (since
max{kAk, kBk} ⩽ 1). Now, M(A, B) is given by
                        ∑ni,j=1 σi τj M(ei fi∗, gj hj∗) .          (3.6)

Fixing all σi , τ j but, say, σ1 , (3.6) is a linear function of σ1 , and thus we can shift σ1 to either −1 or
1, without decreasing (3.6). Proceeding in this way with the other variables σ2 , . . . , σn , τ1 , . . . , τn , each
one in its turn, we obtain ε, δ ∈ {−1, 1}n such that if we define U,V ∈ On by U = ∑ni=1 εi (ei fi∗ ) and
V = ∑ni=1 δi (gi h∗i ) then M(A, B) ⩽ M(U,V ), as required.
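
The proof is constructive, and a hedged numpy sketch follows (names ours). Since the form is linear in each singular value separately, replacing each σi, τi in turn by the better endpoint of {−1, 1} never decreases the value:

import numpy as np

def round_to_orthogonal(M_bilinear, A, B):
    # M_bilinear(P, Q) evaluates the 4-tensor form M(P, Q); A, B are real
    # matrices of operator norm at most 1. Returns orthogonal U, V with
    # M(A, B) <= M(U, V).
    Ua, sa, Vha = np.linalg.svd(A)
    Ub, sb, Vhb = np.linalg.svd(B)
    build = lambda U, s, Vh: (U * s) @ Vh                # U diag(s) V^T
    for s in (sa, sb):
        for i in range(len(s)):
            vals = []
            for cand in (-1.0, 1.0):
                s[i] = cand
                vals.append(M_bilinear(build(Ua, sa, Vha), build(Ub, sb, Vhb)))
            s[i] = (-1.0, 1.0)[int(np.argmax(vals))]     # better endpoint wins
    return build(Ua, sa, Vha), build(Ub, sb, Vhb)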

Proof of Theorem 3.1. We first prove the implication 1 =⇒ 2.
    For any A ∈ M2n(C) define A1, A2, A3, A4 ∈ Mn(C) through the block decomposition

                        A = ( A1  A2 ; A3  A4 ) ,

where the semicolon separates the two block rows.

Given M ∈ Mn(Mn(R)) let M′ ∈ M2n(M2n(R)) be such that M′(A, B) = M(ℜA2, ℜB2) for every A, B ∈ H2n. Formally, for every i, j, k, l ∈ {1, . . . , 2n} we have

              M′ijkl = { (1/4) Mi,j−n,k,l−n    if (i, k) ∈ {1, . . . , n}² and (j, l) ∈ {n+1, . . . , 2n}² ,
                         (1/4) Mj,i−n,l,k−n    if (j, l) ∈ {1, . . . , n}² and (i, k) ∈ {n+1, . . . , 2n}² ,
                         (1/4) Mi,j−n,l,k−n    if (i, l) ∈ {1, . . . , n}² and (j, k) ∈ {n+1, . . . , 2n}² ,
                         (1/4) Mj,i−n,k,l−n    if (j, k) ∈ {1, . . . , n}² and (i, l) ∈ {n+1, . . . , 2n}² ,
                         0                     otherwise.

Then M′ is Hermitian. Apply the algorithm Alg∗ (whose existence is the premise of assertion 1 of Theorem 3.1) to M′ to get A, B ∈ H2n such that Opt∗C (M′) ⩽ K|M′(A, B)|. Since A, B ∈ H2n we have A3 = A∗2 and B3 = B∗2. Because max{kAk, kBk} ⩽ 1, also max{kℜA2k, kℜB2k} ⩽ 1. By Lemma 3.5 we can therefore efficiently find U,V ∈ On such that

      KM(U,V) ⩾ K|M(ℜA2, ℜB2)| = K|M′(A, B)| ⩾ Opt∗C (M′) = sup { |M′(C, D)| : C, D ∈ H2n , kCk, kDk ⩽ 1 }

               ⩾ sup_{S,T∈On} M′( ( 0 S ; S∗ 0 ), ( 0 T ; T∗ 0 ) ) = sup_{S,T∈On} M(S, T) = OptR (M) .

    To prove the reverse implication 2 =⇒ 1, define ψ : M2n(R) → Hn by

          ψ( ( A1 A2 ; A3 A4 ) ) def= (1/4)(A1 + A∗1 + A4 + A∗4) + (i/4)(A2 − A∗2 + A∗3 − A3) ∈ Hn .          (3.7)

Suppose that M ∈ Mn(Mn(C)) is Hermitian and define M″ ∈ M2n(M2n(R)) by requiring that M″(A, B) = M(ψ(A), ψ(B)) for every A, B ∈ M2n(R). Note that M(ψ(A), ψ(B)) ∈ R since M, ψ(A), ψ(B) are all

                      T HEORY OF C OMPUTING, Volume 10 (11), 2014, pp. 257–295                                   280
             E FFICIENT ROUNDING FOR THE N ONCOMMUTATIVE G ROTHENDIECK I NEQUALITY

Hermitian. Apply the algorithm Alg of assertion 2 to M″, obtaining two orthogonal matrices U,V ∈ O2n satisfying

                        OptR (M″) ⩽ K M″(U,V) = K M(ψ(U), ψ(V)) .

By Lemma 3.6 below we have max{kψ(U)k, kψ(V)k} ⩽ 1. Moreover, using Lemma 3.7 below we have

      OptR (M″) = sup { M(ψ(A), ψ(B)) : A, B ∈ M2n(R) , kAk, kBk ⩽ 1 }

               ⩾ sup { M( ψ( ( ℜX ℑX ; −ℑX ℜX ) ), ψ( ( ℜY ℑY ; −ℑY ℜY ) ) ) : X, Y ∈ Hn , kXk, kYk ⩽ 1 } ,          (3.8)

where ℜ(·) and ℑ(·) denote the real and imaginary parts, respectively. Observe that for every X ∈ Hn we have

                        ψ( ( ℜX ℑX ; −ℑX ℜX ) ) = X .
Consequently the rightmost term in (3.8) equals Opt∗C (M). Therefore Opt∗C (M) ⩽ K|M(ψ(U), ψ(V ))|,
so that the algorithm that outputs ψ(U), ψ(V ) has the desired approximation factor.
Lemma 3.6. Let ψ : M2n (R) → Hn be given as in (3.7). Then kψ(Y )k ⩽ kY k for all Y ∈ M2n (R).
Proof. For Y ∈ M2n(R) write Z = (Y + Y∗)/2. Setting

                        Z = ( Z1 Z2 ; Z3 Z4 ) ,
where Z1, Z2, Z3, Z4 ∈ Mn(R), we then have ψ(Y) = (Z1 + Z4)/2 + i(Z2 − Z3)/2. Take z ∈ Cn and write z = x + iy, where x, y ∈ Rn. Then

      ℜ⟨ψ(Y)z, z⟩ = ( ⟨Z1x, x⟩ + ⟨Z4y, y⟩ − ⟨Z2y, x⟩ − ⟨Z3x, y⟩ )/2 + ( ⟨Z1y, y⟩ + ⟨Z4x, x⟩ + ⟨Z2x, y⟩ + ⟨Z3y, x⟩ )/2

                  = (1/2) ⟨ Z (x, −y), (x, −y) ⟩ + (1/2) ⟨ Z (y, x), (y, x) ⟩ .
Since ψ(Y ) is Hermitian, it follows that kψ(Y )k ⩽ kZk ⩽ kY k.
                                                        
Lemma 3.7. For every X ∈ Hn we have k ( ℜX ℑX ; −ℑX ℜX ) k = kXk.
Proof. Write

                        Z = ( ℜX ℑX ; −ℑX ℜX ) ∈ M2n(R) .

Since X is Hermitian, Z is symmetric. For every a, b ∈ Rn,

      ⟨ Z (a, b), (a, b) ⟩ = ⟨(ℜX)a, a⟩ + ⟨(ℑX)b, a⟩ − ⟨(ℑX)a, b⟩ + ⟨(ℜX)b, b⟩ = ℜ( ⟨X(a − ib), a − ib⟩ ) .
Since Z is symmetric and X is Hermitian, it follows that kZk = kXk.



                                         Real rounding procedure

    1. Let X,Y ∈ Mn (Rd ) be given as input, and let ε ∈ {−1, 1}d be chosen uniformly at random.
   2. Set Yε def= ⟨ε, Y⟩. Write the singular value decomposition of Yε as Yε = ∑ni=1 ti(ε) ui(ε)vi(ε)∗, where
      ti(ε) ∈ [0, ∞) and ui(ε), vi(ε) ∈ Rn. Define

                        (Yε)τ def= ∑ni=1 min{ti(ε), τ} ui(ε)vi(ε)∗ ,

      where τ def= √3/2.

   3. Let X(ε) ∈ Mn(R) of norm at most 1 be such that

                        M(X(ε), (Yε)τ) = max { |M(X, (Yε)τ)| : X ∈ Mn(R) , kXk ⩽ 1 } .


   4. Output the pair A = X(ε) and B = (1/τ)(Yε)τ.



Figure 4: The real rounding algorithm takes as input X,Y ∈ Mn (Rd ). It outputs two real matrices
A, B ∈ Mn (R) of norm at most 1.


4    Direct rounding in the real case
We now describe and analyze a different rounding procedure for the real case of the noncommutative
Grothendieck inequality. The argument is based on the proof of the noncommutative Grothendieck
inequality due to Kaijser [25], which itself has a structure similar to the proof of the classical Grothendieck
inequality due to Krivine [27] (see also [22], the “Notes and Remarks” section of Chapter 1 of [9], and
the survey article [24]), and uses ideas from [36] to extend that proof to the noncommutative setting.

Theorem 4.1. Given n ∈ N, M ∈ Mn(Mn(R)) and η ∈ (0, 1/2), suppose that X,Y ∈ On(Rd) are such that

                        M(X, Y) ⩾ (1 − η) SDPR (M) ,          (4.1)

where SDPR (M) is defined in (1.12). Then the rounding procedure described in Figure 4 runs in time polynomial in n and outputs a pair of real matrices A, B ∈ Mn(R) with norm at most 1 such that

                        E[ M(A, B) ] ⩾ ( (1 − 2η)² / (3√3) ) SDPR (M) .          (4.2)

   Note that by Lemma 3.5 we can also efficiently convert the matrices A, B to orthogonal matrices
U,V ∈ On without changing the approximation guarantee.


    The proof of Theorem 4.1 relies on two claims that are used to bound the error that results from the
truncation step (step 2) in the rounding procedure in Figure 4. The first claim plays the same role as
Claim 2.1 did in the complex case.
Claim 4.2. Fix X ∈ Mn(Rd) and let ε ∈ {−1, 1}d be chosen uniformly at random. Set Xε = ⟨ε, X⟩. Then

                  Eε[(Xε Xε∗)²] ⩽ (XX∗)² + 2 kX∗Xk XX∗ ,
                  Eε[(Xε∗ Xε)²] ⩽ (X∗X)² + 2 kXX∗k X∗X .

Proof. Define the symmetric real vector-valued matrix Z by

                        Z def= ( 0  X ; X∗  0 ) .
Following the proof of Lemma 1.1 in [36] (which establishes a bound analogous to the one proved in Corollary 2.4 for the case of i. i. d. {±1} random variables) and defining as usual Zr ∈ M2n(R) by (Zr)ij = (Zij)r for every r ∈ {1, . . . , d}, we have

      Eε[⟨ε, Z⟩⁴] = ∑dr=1 Zr⁴ + ∑_{r,s∈{1,...,d}, r≠s} ( Zr²Zs² + ZrZsZrZs + ZrZs²Zr )

                  = ( ∑dr=1 Zr² )² + ∑_{r,s∈{1,...,d}, r<s} ( ZrZs + ZsZr )² .          (4.3)

Using the inequality (A + B)(A + B)∗ ⩽ 2(AA∗ + BB∗), which follows from (A − B)(A − B)∗ ⩾ 0 for all A, B ∈ Mn(R), we can bound the second sum in (4.3) as follows:

      ∑_{r<s} ( ZrZs + ZsZr )² ⩽ 2 ( ∑dr=1 Zr ( ∑_{s>r} Zs² ) Zr + ∑ds=1 Zs ( ∑_{r<s} Zr² ) Zs )

                              = 2 ∑dr=1 Zr ( ∑_{s≠r} Zs² ) Zr

                              ⩽ 2 ∑dr=1 Zr ( ∑ds=1 Zs² ) Zr .

Replacing Zr by its definition and using ABA∗ ⩽ kBk AA∗, which holds true for every positive semidefinite B ∈ Mn(R), we arrive at the following matrix inequality:

      Eε[⟨ε, Z⟩⁴] = ( Eε[(Xε Xε∗)²]  0 ; 0  Eε[(Xε∗ Xε)²] )

                  ⩽ ( (XX∗)²  0 ; 0  (X∗X)² ) + 2 ( kX∗Xk XX∗  0 ; 0  kXX∗k X∗X ) .

The inequality above implies a separate matrix inequality for each of the two diagonal blocks, proving the claim.
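
For small d and n, Claim 4.2 can be sanity-checked numerically by enumerating all 2^d sign vectors; a hedged sketch (function name ours) testing the first inequality in the positive semidefinite order:

import itertools
import numpy as np

def check_claim_4_2(X, tol=1e-9):
    # X is a list of d real n x n matrices. Verifies that
    # (XX^T)^2 + 2 ||X^T X|| XX^T - E[(X_eps X_eps^T)^2] is psd.
    d, n = len(X), X[0].shape[0]
    E = np.zeros((n, n))
    for eps in itertools.product([-1.0, 1.0], repeat=d):
        Xe = sum(e * Xr for e, Xr in zip(eps, X))
        M2 = Xe @ Xe.T
        E += M2 @ M2
    E /= 2 ** d
    XXt = sum(Xr @ Xr.T for Xr in X)
    XtX = sum(Xr.T @ Xr for Xr in X)
    bound = XXt @ XXt + 2 * np.linalg.norm(XtX, 2) * XXt
    return bool(np.all(np.linalg.eigvalsh(bound - E) >= -tol))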


    The second claim appears as Lemma 2.3 in [25] (in the complex case). We include a short proof for
the sake of completeness.

Claim 4.3. Let Y ∈ Mn(R). For any τ > 0 there exists a decomposition Y = Yτ + Y^τ such that kYτk ⩽ τ and

            Y^τ (Y^τ)∗ ⩽ (1/(4τ)²) (YY∗)²      and      (Y^τ)∗ Y^τ ⩽ (1/(4τ)²) (Y∗Y)² .
Proof. Define Yτ by “truncating” the singular values of Y at τ, as is done in Step 2 of the rounding procedure described in Figure 4, so that kYτk ⩽ τ. Define Y^τ = Y − Yτ. By definition, Y, Yτ and Y^τ all have the same singular vectors, and the non-zero singular values of Y^τ are of the form µ − τ, where µ ⩾ τ is a singular value of Y. Using the bound µ − τ ⩽ µ²/(4τ), valid for any µ ⩾ τ, every nonzero eigenvalue λ = (µ − τ)² of Y^τ(Y^τ)∗ (resp. (Y^τ)∗Y^τ) satisfies λ ⩽ µ⁴/(4τ)², which proves the claim.
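
The decomposition of Claim 4.3 is again a small SVD manipulation; a minimal sketch with names of our choosing:

import numpy as np

def truncate_at(Y, tau):
    # Split Y = Y_tau + Y^tau by truncating singular values at tau, so that
    # ||Y_tau|| <= tau while the tail obeys the quadratic bounds of Claim 4.3.
    U, s, Vh = np.linalg.svd(Y)
    Y_tau = (U * np.minimum(s, tau)) @ Vh
    return Y_tau, Y - Y_tau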

Proof of Theorem 4.1. We shall continue using the notation introduced in Figure 4, Claim 4.2 and Claim 4.3. Let X,Y ∈ On(Rd) satisfy (4.1). For every ε ∈ {−1, 1}d and any τ > 0 let

                        Yε = ⟨ε, Y⟩ = (Yε)τ + Yε^τ

be the decomposition promised by Claim 4.3. Combining the bound from Claim 4.2 with the one from Claim 4.3, we see that

            k Eε[Yε^τ (Yε^τ)∗] k ⩽ (1/(4τ)²) ( kYY∗k² + 2 kY∗Yk kYY∗k ) = 3/(4τ)² ,          (4.4)

where the final step of (4.4) follows from Y ∈ On(Rd); the same bound holds for kEε[(Yε^τ)∗ Yε^τ]k. We have

      M(X, Y) = Eε[ M(Xε, Yε) ] ⩽ Eε[ M(Xε, (Yε)τ) ] + Eε[ |M(Xε, Yε^τ)| ]

                               ⩽ Eε[ M(Xε, (Yε)τ) ] + (√3/(4τ)) SDPR (M) ,          (4.5)

where the last inequality in (4.5) follows from the definition of SDPR (M) and (4.4).
    To bound the first term in (4.5), let

            Xε = ∑ni=1 si(ε) wi(ε)zi(ε)∗      and      (Yε)τ = ∑ni=1 t′i(ε) ui(ε)vi(ε)∗ ,

where si(ε), t′i(ε) ⩾ 0 and ui(ε), vi(ε), wi(ε), zi(ε) ∈ Rn, be the singular value decompositions of Xε and (Yε)τ, respectively. Set Mi,j(ε) def= M(wi(ε)zi(ε)∗, uj(ε)vj(ε)∗). With these definitions,

      Eε[ M(Xε, (Yε)τ) ] = Eε[ ∑ni,j=1 Mi,j(ε) si(ε) t′j(ε) ] ⩽ Eε[ ∑ni=1 si(ε) | ∑nj=1 Mi,j(ε) t′j(ε) | ]

            ⩽ ( Eε[ ∑ni=1 si(ε)² | ∑nj=1 Mi,j(ε) t′j(ε) | ] )^{1/2} · ( Eε[ ∑ni=1 | ∑nj=1 Mi,j(ε) t′j(ε) | ] )^{1/2} .          (4.6)




Note that the rightmost term in (4.6) is at most

                        ( τ Eε[ M(X(ε), B(ε)) ] )^{1/2} ,

where B(ε) = (1/τ)(Yε)τ and X(ε) is as defined in the rounding procedure in Figure 4. To bound the leftmost term in (4.6), define

                        Wε def= ∑ni=1 ri(ε) si(ε)² wi(ε)zi(ε)∗ ,

where ri(ε) is the sign of ∑nj=1 Mi,j(ε) t′j(ε), so that

                        ∑ni=1 si(ε)² | ∑nj=1 Mi,j(ε) t′j(ε) | = M(Wε, (Yε)τ) .

Moreover, by definition Wε Wε∗ = (Xε Xε∗)² and Wε∗ Wε = (Xε∗ Xε)², so using Claim 4.2 we have

            k Eε[Wε∗ Wε] k = k Eε[(Xε∗ Xε)²] k ⩽ kX∗Xk² + 2 kXX∗k kX∗Xk = 3 ,

and the same bound holds for kEε[Wε Wε∗]k. By the definition of (Yε)τ we also have

            max{ k Eε[((Yε)τ)∗ (Yε)τ] k , k Eε[(Yε)τ ((Yε)τ)∗] k } ⩽ τ² .

Hence

      Eε[ ∑ni=1 si(ε)² | ∑nj=1 Mi,j(ε) t′j(ε) | ] = √3 τ · Eε[ M( Wε/√3 , (Yε)τ/τ ) ] ⩽ √3 τ · SDPR (M) ,

where we used the definition of SDPR (M). Finally, combining (4.5) and (4.6) with the bounds shown above we obtain

   (1 − η) SDPR (M) ⩽ M(X, Y) ⩽ ⁴√3 · τ · ( SDPR (M) · Eε[M(X(ε), B(ε))] )^{1/2} + (√3/(4τ)) SDPR (M) .          (4.7)

Setting τ = √3/2 in (4.7) and simplifying leads to the desired bound (4.2).
    All steps in the rounding procedure can be performed efficiently. The calculation of X(ε) in the third step of Figure 4 can be expressed as a semidefinite program and solved in time polynomial in n. Alternatively, one may compute X(ε) directly as follows. Write the polar decomposition

                        Tr2( M∗ (I ⊗ (Yε)τ) ) = QP ∈ Mn(R) ,

where the partial trace is taken with respect to the second tensor factor, Q is an orthogonal matrix and P is positive semidefinite. The optimal choice of X(ε) corresponds to the real matrix of norm at most 1 maximizing

            |Tr( X(ε) · QP )| = |Tr( X(ε) · Tr2(M∗(I ⊗ (Yε)τ)) )| = |M(X(ε), (Yε)τ)| ,

and this is achieved by taking X(ε) = Q∗.
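
In code, this closed form reads as follows. A hedged sketch assuming scipy, with the 4-tensor M stored as an (n, n, n, n) array M[i, j, k, l]; note that under the entrywise pairing ∑ij Xij Nij used here the maximizer is Q itself, matching the paper's X(ε) = Q∗ under its trace pairing:

import numpy as np
from scipy.linalg import polar

def best_X(M, B):
    # Return the norm-<=1 real matrix X maximizing |sum_ijkl M_ijkl X_ij B_kl|,
    # where B plays the role of (Y_eps)_tau.
    N = np.einsum('ijkl,kl->ij', M, B)   # N_ij = sum_kl M_ijkl B_kl
    Q, P = polar(N)                      # N = QP, Q orthogonal, P psd
    return Q                             # achieves Tr(Q N^T) = Tr(P), the trace norm of N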


5    Some applications
Before presenting the details of our applications of Theorem 1.1, we observe that the problem of computing
OptR (M) is a rather versatile optimization problem, perhaps more so than what one might initially guess
from its definition. The main observation is that by considering matrices M which only act non-trivially
on certain diagonal blocks of the two variables U,V that appear in the definition of OptR (M), these
variables can each be thought of as a sequence of multiple matrix variables, possibly of different shapes
but all with operator norm at most 1. This allows for some flexibility in adapting the noncommutative
Grothendieck optimization problem to concrete settings, and we explain the transformation in detail next.
    For every n, m ⩾ 1, let Mm,n(R) be the vector space of real m × n matrices. Given integers k, ℓ ⩾ 1 and sequences of integers (mi), (ni) ∈ Nk, (pj), (qj) ∈ Nℓ, we define BilR(k, ℓ; (mi), (ni), (pj), (qj)), or simply BilR(k, ℓ) when the remaining sequences are clear from context, as the set of all

                        f : ( ⊕ki=1 Mmi,ni(R) ) × ( ⊕ℓj=1 Mpj,qj(R) ) → R

that are linear in both arguments. Concretely, f ∈ BilR(k, ℓ) if and only if there exist real coefficients αirs,juv such that for every (Ai) ∈ ⊕ki=1 Mmi,ni(R) and (Bj) ∈ ⊕ℓj=1 Mpj,qj(R),

      f( (Ai)i∈{1,...,k} , (Bj)j∈{1,...,ℓ} ) = ∑ki=1 ∑ℓj=1 ∑_{r=1}^{mi} ∑_{s=1}^{ni} ∑_{u=1}^{pj} ∑_{v=1}^{qj} αirs,juv (Ai)rs (Bj)uv .          (5.1)

For integers m, n ⩾ 1, let Om,n ⊂ Mm,n (R) denote the set of all m × n real matrices U such that UU ∗ = I if
m ⩽ n and U ∗U = I if m ⩾ n. If m = n then On,n = On is the set of orthogonal matrices; On,1 is the set of
all n-dimensional unit vectors; O1,1 is simply the set {−1, 1}. Given f ∈ BilR(k, ℓ), consider the quantity

            OptR (f) def= sup { f( (Ui), (Vj) ) : (Ui) ∈ ⊕ki=1 Omi,ni , (Vj) ∈ ⊕ℓj=1 Opj,qj } .



Note that this definition coincides with the definition of OptR (f) given in the introduction whenever f ∈ BilR(1, 1; n, n, n, n). The proof of the following proposition shows that the new optimization problem still belongs to the framework of the noncommutative Grothendieck problem.
Proposition 5.1. There exists a polynomial-time algorithm that takes as input k, ℓ ∈ N, (mi), (ni) ∈ Nk, (pj), (qj) ∈ Nℓ and f ∈ BilR(k, ℓ; (mi), (ni), (pj), (qj)) and outputs (Ui) ∈ ⊕ki=1 Omi,ni and (Vj) ∈ ⊕ℓj=1 Opj,qj such that

                        OptR (f) ⩽ O(1) · f( (Ui), (Vj) ) .

Moreover, the implied constant in the O(1) term can be taken to be any number larger than 2√2.
Proof. Let k, ℓ ∈ N, (mi), (ni) ∈ Nk, (pj), (qj) ∈ Nℓ be given, and define

            m def= ∑ki=1 mi ,    n def= ∑ki=1 ni ,    p def= ∑ℓj=1 pj ,    q def= ∑ℓj=1 qj ,



and t def= max{m, n, p, q}. We first describe how ⊕ki=1 Mmi,ni(R) (and ⊕ℓj=1 Mpj,qj(R), respectively) can be identified with a subset of Mt(R) consisting of block-diagonal matrices. For any i ∈ {1, . . . , k} and r ∈ {1, . . . , mi}, s ∈ {1, . . . , ni}, let Fir,s ∈ Mt(R) be the matrix that has all entries equal to 0 except the entry in position (r + ∑j<i mj , s + ∑j<i nj), which equals 1. Similarly, for any j ∈ {1, . . . , ℓ} and u ∈ {1, . . . , pj}, v ∈ {1, . . . , qj} we let Gju,v ∈ Mt(R) be the matrix that has all entries equal to 0 except the entry in position (u + ∑i<j pi , v + ∑i<j qi), which equals 1. Define linear maps Φ : ⊕ki=1 Mmi,ni(R) → Mt(R) and Ψ : ⊕ℓj=1 Mpj,qj(R) → Mt(R) by

      Φ((Ai)) def= ∑ki=1 ∑_{r=1}^{mi} ∑_{s=1}^{ni} (Ai)r,s Fir,s      and      Ψ((Bj)) def= ∑ℓj=1 ∑_{u=1}^{pj} ∑_{v=1}^{qj} (Bj)u,v Gju,v .

From the definition, one verifies that
$$\|\Phi((A_i))\| = \max_{i \in \{1,\dots,k\}} \|A_i\| \qquad \text{and} \qquad \|\Psi((B_j))\| = \max_{j \in \{1,\dots,\ell\}} \|B_j\|\,. \tag{5.2}$$
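As an illustration of this identification (not part of the proof), here is a minimal NumPy sketch, with names of our own choosing, of the map $\Phi$; the assertion checks the first identity in (5.2) on random blocks.

    import numpy as np

    def embed_blocks(blocks, t):
        # Place rectangular blocks A_1, ..., A_k on the diagonal of a t x t
        # matrix, block i offset by the dimensions of the earlier blocks;
        # this mirrors the map Phi from the proof of Proposition 5.1.
        out = np.zeros((t, t))
        row = col = 0
        for A in blocks:
            m, n = A.shape
            out[row:row + m, col:col + n] = A
            row += m
            col += n
        return out

    rng = np.random.default_rng(0)
    blocks = [rng.standard_normal((2, 3)), rng.standard_normal((4, 1))]
    t = max(sum(A.shape[0] for A in blocks), sum(A.shape[1] for A in blocks))
    # Operator norm of the embedding equals the largest block norm, cf. (5.2).
    assert np.isclose(np.linalg.norm(embed_blocks(blocks, t), 2),
                      max(np.linalg.norm(A, 2) for A in blocks))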

Let $f \in \mathrm{Bil}_{\mathbb{R}}(k, \ell)$ and let $(\alpha_{irs,juv})$ be real coefficients as in (5.1). Define $M \in M_t(M_t(\mathbb{R}))$ by
$$M\big(F^i_{r,s}, G^j_{u,v}\big) \stackrel{\text{def}}{=} \alpha_{irs,juv}\,,$$
and $M(A,B) \stackrel{\text{def}}{=} 0$ if $A$ or $B$ are orthogonal to the linear span of the $F^i_{r,s}$ or $G^j_{u,v}$, respectively. (Recall that the notation $M(A,B)$ was introduced at the beginning of Section 2.) With this definition, for any $(A_i) \in \bigoplus_{i=1}^{k} M_{m_i,n_i}(\mathbb{R})$ and $(B_j) \in \bigoplus_{j=1}^{\ell} M_{p_j,q_j}(\mathbb{R})$ we have
$$M\big(\Phi((A_i)), \Psi((B_j))\big) = f\big((A_i),(B_j)\big)\,.$$

We claim that $\operatorname{Opt}_{\mathbb{R}}(f) = \operatorname{Opt}_{\mathbb{R}}(M)$, where $\operatorname{Opt}_{\mathbb{R}}(M)$ is as defined in the introduction. Indeed, by (5.2), for any $(U_i) \in \bigoplus_{i=1}^{k} O_{m_i,n_i}$ and $(V_j) \in \bigoplus_{j=1}^{\ell} O_{p_j,q_j}$ we have $\|\Phi((U_i))\|, \|\Psi((V_j))\| \le 1$, hence $\operatorname{Opt}_{\mathbb{R}}(f) \le \operatorname{Opt}_{\mathbb{R}}(M)$. Conversely, let $U, V \in O_t$ be arbitrary, and let $U', V'$ be their orthogonal projections onto $\operatorname{Im}(\Phi)$ and $\operatorname{Im}(\Psi)$, respectively. Then $M(U', V') = M(U, V)$. Moreover, if
$$(U_i') \in \bigoplus_{i=1}^{k} M_{m_i,n_i}(\mathbb{R}) \qquad \text{and} \qquad (V_j') \in \bigoplus_{j=1}^{\ell} M_{p_j,q_j}(\mathbb{R})$$
are such that $\Phi((U_i')) = U'$ and $\Psi((V_j')) = V'$, then by (5.2), $\max_{i \in \{1,\dots,k\}} \|U_i'\| = \|U'\| \le \|U\| = 1$, and similarly $\max_{j \in \{1,\dots,\ell\}} \|V_j'\| \le 1$. As in the proof of Lemma 3.5, we may then argue that there exist $U_i \in O_{m_i,n_i}$, $V_j \in O_{p_j,q_j}$ such that
$$f\big((U_i),(V_j)\big) \ge f\big((U_i'),(V_j')\big) = M(U', V') = M(U, V)\,,$$

proving the reverse inequality OptR ( f ) ⩾ OptR (M).
    To conclude the proof of Proposition 5.1 it remains to note that the algorithm of Theorem 1.1, when
applied to M, produces in polynomial time U,V ∈ Ot such that OptR (M) ⩽ O(1) · M(U,V ). Arguing as
in Lemma 3.5, (Ui ) and (V j ) can be computed from U,V in polynomial time and constitute the output of
the algorithm.

5.1   Constant-factor algorithm for robust PCA problems
We start with the R1-PCA problem, as described in (1.3). Let a1 , . . . , aN ∈ Rn be given, and define
f ∈ BilR (1, N; (K), (n), (1, . . . , 1), (K, . . . , K)) by

$$f\big(Y, (Z_1,\dots,Z_N)\big) \stackrel{\text{def}}{=} \sum_{i=1}^{N} \sum_{k=1}^{K} (Z_i)_{1k} \sum_{j=1}^{n} Y_{kj} (a_i)_j\,.$$

The condition Zi ∈ O1,K is equivalent to Zi being a unit vector, while Y ∈ OK,n is equivalent to the K
rows of $Y$ being orthonormal. Using that the $\ell_2$ norm of a vector $u \in \mathbb{R}^K$ is equal to $\max_{v \in S^{K-1}} \langle u, v \rangle$, the quantity appearing in (1.3) is equal to
$$\sup_{\substack{Y \in O_{K,n} \\ Z_1,\dots,Z_N \in O_{1,K}}} f\big(Y, (Z_1,\dots,Z_N)\big)\,,$$


which by definition is equal to OptR ( f ). An approximation algorithm for the R1-PCA problem then
follows immediately from Proposition 5.1. The algorithm for L1-PCA follows similarly. Letting
g ∈ BilR (1, NK; (K), (n), (1, . . . , 1), (1, . . . , 1)) be defined as

$$g\big(Y, (Z_{1,1},\dots,Z_{N,K})\big) = \sum_{i=1}^{N} \sum_{k=1}^{K} Z_{ik} \sum_{j=1}^{n} Y_{kj} (a_i)_j\,,$$


and using that Zik ∈ O1,1 is equivalent to Zik ∈ {−1, 1}, the quantity appearing in (1.4) is equal to
                                                                                 
$$\sup_{\substack{Y \in O_{K,n} \\ Z_{1,1},\dots,Z_{N,K} \in O_{1,1}}} g\big(Y, (Z_{1,1},\dots,Z_{N,K})\big) = \operatorname{Opt}_{\mathbb{R}}(g)\,,$$


which again fits into the framework of Proposition 5.1.
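As a numerical sanity check on the R1-PCA reduction (ours, not part of the paper's argument), the following sketch verifies that for a fixed $Y$ with orthonormal rows, taking each unit vector $Z_i = Y a_i / \|Y a_i\|_2$ makes $f(Y, (Z_i))$ equal to the objective $\sum_i \|Y a_i\|_2$ appearing in (1.3).

    import numpy as np

    rng = np.random.default_rng(1)
    n, K, N = 5, 2, 7
    a = rng.standard_normal((N, n))                       # data points a_1, ..., a_N
    Y = np.linalg.qr(rng.standard_normal((n, K)))[0].T    # K orthonormal rows

    # For fixed Y the optimal unit vector is Z_i = Y a_i / ||Y a_i||_2, so that
    # f(Y, (Z_i)) = sum_i <Z_i, Y a_i> = sum_i ||Y a_i||_2.
    objective = sum(np.linalg.norm(Y @ ai) for ai in a)
    f_value = sum((Y @ ai) @ (Y @ ai) / np.linalg.norm(Y @ ai) for ai in a)
    assert np.isclose(objective, f_value)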

5.2   A constant-factor algorithm for the orthogonal Procrustes problem
The generalized orthogonal Procrustes problem was described in Section 1.1.3. Continuing with the
notation introduced there, let A1 , . . . , AK be d × n real matrices such that the i-th column of Ak is the
vector xik ∈ Rd . Our goal is then to efficiently approximate

$$P(A_1,\dots,A_K) \stackrel{\text{def}}{=} \max_{U_1,\dots,U_K \in O_d} \Big\| \sum_{k=1}^{K} U_k A_k \Big\|_2^2 = \max_{U_1,\dots,U_K \in O_d} \sum_{k,l=1}^{K} \langle U_k A_k, U_l A_l \rangle\,, \tag{5.3}$$
where we set $\langle A, B \rangle \stackrel{\text{def}}{=} \sum_{i=1}^{d} \sum_{j=1}^{n} A_{ij} B_{ij}$ for every two $d \times n$ real matrices $A, B$. We first observe that (5.3) is equal to
$$\max_{U_1,\dots,U_K \in O_d}\ \max_{V_1,\dots,V_K \in O_d} \Big\langle \sum_{k=1}^{K} U_k A_k,\ \sum_{l=1}^{K} V_l A_l \Big\rangle\,. \tag{5.4}$$


It is clear that (5.4) is at least as large as (5.3). For the other direction, using Cauchy-Schwarz,
$$\Big\langle \sum_{k=1}^{K} U_k A_k,\ \sum_{l=1}^{K} V_l A_l \Big\rangle \le \Big\| \sum_{k=1}^{K} U_k A_k \Big\|_2 \cdot \Big\| \sum_{l=1}^{K} V_l A_l \Big\|_2\,,$$

so either U1 , . . . ,UK or V1 , . . . ,VK achieve a value in (5.3) that is at least as large as (5.4). The desired
algorithm now follows by noting that (5.4) falls into the framework of Proposition 5.1: it suffices to define
f ∈ BilR (K, K; (d, . . . , d), (d, . . . , d), (d, . . . , d), (d, . . . , d)) by
$$f\big((U_1,\dots,U_K), (V_1,\dots,V_K)\big) = \Big\langle \sum_{k=1}^{K} U_k A_k,\ \sum_{l=1}^{K} V_l A_l \Big\rangle\,.$$

Finally, from an assignment to (5.4) we can efficiently extract an assignment to (5.3) achieving at least as
high a value by choosing the one among the (Uk ) or the (Vl ) that leads to a higher value.
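Concretely, the extraction step amounts to evaluating both candidate families and keeping the better one. A minimal sketch follows; the families $(U_k), (V_l)$ are assumed to come from a black-box implementation of Proposition 5.1.

    import numpy as np

    def procrustes_value(Us, As):
        # Objective of (5.3): squared Frobenius norm of sum_k U_k A_k.
        return np.linalg.norm(sum(U @ A for U, A in zip(Us, As))) ** 2

    def extract_assignment(Us, Vs, As):
        # Given (U_k), (V_l) achieving a large value in (5.4), return the
        # family whose value in (5.3) is larger; by Cauchy-Schwarz this value
        # is at least the bilinear value achieved in (5.4).
        return max(Us, Vs, key=lambda Ws: procrustes_value(Ws, As))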

Comparison with previous work. To compare the approximation guarantee that we obtained with
the literature, we first note that P(A1 , . . . , AK ) is attained at U1 , . . . ,UK ∈ Od if and only if U1 , . . . ,UK are
maximizers of the following quantity over all V1 , . . . ,VK ∈ Od :

$$\sum_{\substack{k,l \in \{1,\dots,K\} \\ k \neq l}} \langle V_k A_k, V_l A_l \rangle\,. \tag{5.5}$$

Indeed, the diagonal term $\langle V_k A_k, V_k A_k \rangle = \|V_k A_k\|_2^2$ equals $\|A_k\|_2^2$, since $V_k$ is orthogonal. Consequently, the quantity appearing in (5.5) differs from the quantity defining $P(A_1,\dots,A_K)$ by an additive term that does not depend on $V_1,\dots,V_K$. In the same vein, as already mentioned in Section 1.1.3, $P(A_1,\dots,A_K)$
is attained at U1 , . . . ,UK ∈ Od if and only if U1 , . . . ,UK are minimizers of the following quantity over all
V1 , . . . ,VK ∈ Od :
$$\sum_{k,l=1}^{K} \| V_k A_k - V_l A_l \|_2^2\,. \tag{5.6}$$
    While the optimization problems appearing in (5.3), (5.5) and (5.6) have the same exact solutions,
this is no longer the case when it comes to approximation algorithms. To the best of our knowledge
the polynomial-time approximability of the minimization of the quantity appearing in (5.6) was not
investigated in the literature. Nemirovski [35] and So [43] studied the polynomial-time approximability
of the maximization of the quantity appearing in (5.5): the best known algorithm for this problem [43] has
approximation factor O(log(n + d + K)). This immediately translates to the same approximation factor
for computing P(A1 , . . . , AK ), which was the previously best known algorithm for this problem. Our
constant-factor approximation algorithm for P(A1 , . . . , AK ) yields an approximation for the maximization
of the quantity appearing in (5.5) that has a better approximation factor than that of [43] unless the
additive difference $\sum_{k=1}^{K} \|A_k\|_2^2$ is too large. Precisely, our algorithm becomes applicable in this context as long as this term is at most a factor $(1 - 1/C)$ smaller than $P(A_1,\dots,A_K)$, where $C$ is the approximation constant from Proposition 5.1. This will be the case for typical applications in which one may think of each $A_k$ as obtained from a common $A$ by applying a small perturbation followed by an arbitrary rotation: in that case it is reasonable to expect the optimal solution to satisfy $\langle U_k A_k, U_l A_l \rangle \approx \|A\|_2^2$ for every $l, k \in \{1,\dots,K\}$; see, e. g., [35, Sec. 4.3].

5.3   An algorithmic noncommutative dense regularity lemma
Our goal here is to prove Theorem 1.3, but before doing so we note that it leads to a PTAS for computing $\operatorname{Opt}_{\mathbb{C}}(M)$ whenever $\operatorname{Opt}_{\mathbb{C}}(M) \ge \kappa n \|M\|_2$ for some constant $\kappa > 0$ (we shall use below the notation introduced in the statement of Theorem 1.3).
    The idea for this is exactly as in Section 3 of [12]. The main point is that given such an $M$ and $\varepsilon \in (0,1)$, the decomposition in Theorem 1.3 only involves $T = O(1/(\kappa\varepsilon)^2)$ terms, which can be computed in polynomial time. Given such a decomposition, we will exhaustively search over a suitably discretized set of values $a_t, b_t$ for $\operatorname{Tr}(A_t X)$ and $\operatorname{Tr}(B_t Y)$, respectively. For each such choice of values, verifying whether it is achievable using an $X$ and $Y$ of operator norm at most $1$ can be cast as a semidefinite program. The final approximation to $\operatorname{Opt}_{\mathbb{C}}(M)$ is given by the maximum value of $\big|\sum_{t=1}^{T} \alpha_t a_t b_t\big|$, among those sequences $(a_t), (b_t)$ that were determined to be feasible.
    In slightly more detail, first note that for any $X, Y$ of operator norm at most $1$ the values of $\operatorname{Tr}(A_t X)$ and $\operatorname{Tr}(B_t Y)$ lie in the complex disc of radius $n$. Given our assumption on $\operatorname{Opt}_{\mathbb{C}}(M)$, the bound on $\alpha_t$ stated in Theorem 1.3 implies $|\alpha_t| = O\big(\operatorname{Opt}_{\mathbb{C}}(M)/(\kappa n^2)\big)$. Hence an approximation of each $\operatorname{Tr}(A_t X)$ and $\operatorname{Tr}(B_t Y)$ up to additive error $\varepsilon\kappa n/T$ will translate to an additive approximation error to $M(X,Y)$ of $O(\varepsilon \operatorname{Opt}_{\mathbb{C}}(M))$. As a result, to obtain a multiplicative $(1 \pm \varepsilon)$ approximation to $\operatorname{Opt}_{\mathbb{C}}(M)$ it will suffice to exhaustively enumerate among the $\big(O\big((n \cdot T/(\varepsilon\kappa n))^2\big)\big)^{2T}$ possible values for the sequences $(a_t), (b_t)$.
    Finally, to decide whether a given sequence of values can be achieved, it suffices to decide two independent feasibility problems of the following form: given $n \times n$ complex matrices $(A_t)_{t=1}^{T}$ of norm at most $1$ and $(a_t)_{t=1}^{T} \in \mathbb{C}^{T}$, does there exist $X \in M_n(\mathbb{C})$ of norm at most $1$ such that
$$\max_{t \in \{1,\dots,T\}} \max\big\{\, |\operatorname{Re}(\operatorname{Tr}(A_t X) - a_t)|,\ |\operatorname{Im}(\operatorname{Tr}(A_t X) - a_t)| \,\big\} \le \frac{\varepsilon \kappa n}{T}\,?$$

This problem can be cast as a semidefinite program, and feasibility can be decided in time that is
polynomial in n and T .
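For illustration only, such a feasibility problem can be handed to an off-the-shelf SDP modeling layer. The sketch below uses CVXPY (our choice of tool; the function name and interface are ours) with a complex matrix variable and a spectral-norm constraint.

    import cvxpy as cp

    def feasible(As, targets, tol):
        # Is there X in M_n(C) with ||X|| <= 1 whose traces Tr(A_t X) match
        # the targets a_t up to tol in both real and imaginary parts?
        n = As[0].shape[0]
        X = cp.Variable((n, n), complex=True)
        constraints = [cp.norm(X, 2) <= 1]          # operator-norm ball
        for A, a in zip(As, targets):
            tr = cp.trace(A @ X)
            constraints.append(cp.abs(cp.real(tr) - a.real) <= tol)
            constraints.append(cp.abs(cp.imag(tr) - a.imag) <= tol)
        problem = cp.Problem(cp.Minimize(0), constraints)
        problem.solve()
        return problem.status == cp.OPTIMAL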

                                                                       τ−1
Proof of Theorem 1.3. The argument is iterative. Assume that {At , Bt }t=1 ⊆ Un have already been
                         def
constructed. Write M1 = M and
                                                           τ−1
                                             def
                                         Mτ = M − ∑ αt (At ⊗ Bt ) .
                                                           t=1

If $\operatorname{Opt}_{\mathbb{C}}(M_\tau) \le \varepsilon \operatorname{Opt}_{\mathbb{C}}(M)$ then we may stop the construction. Otherwise, by Theorem 1.1 and multiplying by an appropriate complex phase if necessary, we can find $A_\tau, B_\tau \in U_n$ such that
$$M_\tau(A_\tau, B_\tau) \ge \Big( \frac{1}{\sqrt{2}} - \frac{\varepsilon}{\sqrt{2}} \Big) \operatorname{Opt}_{\mathbb{C}}(M_\tau)\,, \tag{5.7}$$

with the left-hand side of (5.7) real. Set
$$\alpha_\tau \stackrel{\text{def}}{=} \frac{M_\tau(A_\tau, B_\tau)}{\|A_\tau\|_2^2 \cdot \|B_\tau\|_2^2}\,,$$

and define $M_{\tau+1} \stackrel{\text{def}}{=} M_\tau - \alpha_\tau (A_\tau \otimes B_\tau)$. By expanding the square we have
$$\|M_{\tau+1}\|_2^2 = \|M_\tau\|_2^2 - \frac{M_\tau(A_\tau, B_\tau)^2}{\|A_\tau\|_2^2 \|B_\tau\|_2^2} \le \|M_\tau\|_2^2 - \frac{2(1-\varepsilon)^2}{4n^2} \operatorname{Opt}_{\mathbb{C}}(M_\tau)^2 \le \|M_\tau\|_2^2 - \frac{\varepsilon^2 (1-\varepsilon)^2}{4n^2} \operatorname{Opt}_{\mathbb{C}}(M)^2\,. \tag{5.8}$$
It follows from (5.8) that as long as this process continues, $\|M_\tau\|_2^2$ decreases by an additive term of at least $\varepsilon^2 (1-\varepsilon)^2 \operatorname{Opt}_{\mathbb{C}}(M)^2 / (4n^2)$ at each step. This process must therefore terminate after at most $T \le 4n^2 \|M\|_2^2 / (\varepsilon(1-\varepsilon)\operatorname{Opt}_{\mathbb{C}}(M))^2$ steps. It also follows that $\|M_\tau\|_2 \le \|M\|_2$ for every $\tau$, so that
$$|\alpha_\tau| \le \frac{\|M_\tau\|_2 \|A_\tau\|_2 \|B_\tau\|_2}{\|A_\tau\|_2^2 \cdot \|B_\tau\|_2^2} = \frac{\|M_\tau\|_2}{n} \le \frac{\|M\|_2}{n}\,,$$
where the first inequality follows from Cauchy-Schwarz, and the equality uses that Aτ , Bτ ∈ Un .
    The “moreover” part of Theorem 1.3 follows immediately by using the part of Theorem 1.1 pertaining
to OptR (M) (note that the specific approximation factor does not matter here).
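In algorithmic terms the proof is a short greedy loop. The following sketch (ours, not the paper's) works with the real analogue for concreteness: it represents the bilinear form by its $n^2 \times n^2$ coefficient matrix, so that $A \otimes B$ corresponds to $\operatorname{vec}(A)\operatorname{vec}(B)^{\mathsf{T}}$, and it treats the rounding algorithm of Theorem 1.1 and the computation of the optimum as assumed black boxes.

    import numpy as np

    def bilinear_value(M, A, B):
        # One convention for the form: M(A, B) = vec(A)^T M vec(B).
        return A.reshape(-1) @ M @ B.reshape(-1)

    def decompose(M, opt_value, rounding_oracle, eps):
        # Greedy loop of Theorem 1.3 (real analogue). Assumed black boxes:
        # rounding_oracle(M) returns orthogonal A, B with a constant-factor
        # guarantee as in Theorem 1.1; opt_value(M) approximates Opt(M).
        terms = []
        M_tau = M.copy()
        opt_M = opt_value(M)
        while opt_value(M_tau) > eps * opt_M:
            A, B = rounding_oracle(M_tau)
            alpha = bilinear_value(M_tau, A, B) / (
                np.linalg.norm(A) ** 2 * np.linalg.norm(B) ** 2)
            terms.append((alpha, A, B))
            # Subtracting alpha * (A tensor B) = alpha * vec(A) vec(B)^T
            # decreases ||M_tau||_2^2 at each step, as in (5.8).
            M_tau = M_tau - alpha * np.outer(A.reshape(-1), B.reshape(-1))
        return terms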

Acknowledgements. We thank Daniel Dadush and Raghu Meka for useful discussions.


References
 [1] Noga Alon, Konstantin Makarychev, Yury Makarychev, and Assaf Naor: Quadratic forms on graphs. Invent. Math., 163(3):499–522, 2006. Preliminary version in STOC'05. [doi:10.1007/s00222-005-0465-9]

 [2] Noga Alon and Assaf Naor: Approximating the cut-norm via Grothendieck's inequality. SIAM J. Comput., 35(4):787–803, 2006. Preliminary version in STOC'04. [doi:10.1137/S0097539704441629]

 [3] Jos M. F. ten Berge: Orthogonal Procrustes rotation for two or more matrices. Psychometrika, 42(2):267–276, 1977. [doi:10.1007/BF02294053]

 [4] Ph. Bourgeois: Extension de la méthode Procuste à la comparaison de nuages de points situés dans des espaces de dimensions différentes. RAIRO Rech. Opér., 16(1):45–63, 1982. Found at EUDML.

 [5] Mark Braverman, Konstantin Makarychev, Yury Makarychev, and Assaf Naor: The Grothendieck constant is strictly smaller than Krivine's bound. Forum of Mathematics, Pi, 1(e4), 2013. Preliminary version in FOCS'11. [doi:10.1017/fmp.2013.4]

 [6] Eduardo R. Caianiello and Renato M. Capocelli: On form and language: the Procrustes algorithm for feature extraction. Kybernetik, 8(6):223–233, 1971. [doi:10.1007/BF00288751]

 [7] Richard Cleve, Peter Høyer, Ben Toner, and John Watrous: Consequences and limits of nonlocal strategies. In Proc. 19th IEEE Conf. on Computational Complexity (CCC'04), pp. 236–249, 2004. [doi:10.1109/CCC.2004.1313847]

 [8] Luc Devroye: Nonuniform Random Variate Generation. Springer, 1986. Found on author's website.

 [9] Joe Diestel, Hans Jarchow, and Andrew Tonge: Absolutely Summing Operators. Volume 43 of Cambridge Studies in Advanced Mathematics. Cambridge Univ. Press, 1995. [doi:10.1017/CBO9780511526138]

[10] Chris Ding, Ding Zhou, Xiaofeng He, and Hongyuan Zha: R1-PCA: Rotational invariant L1-norm principal component analysis for robust subspace factorization. In Proc. 23rd Internat. Conf. on Machine Learning (ICML'06), pp. 281–288. ACM Press, 2006. [doi:10.1145/1143844.1143880]

[11] Ian L. Dryden and Kanti V. Mardia: Statistical Shape Analysis. Wiley, 1998.

[12] Alan M. Frieze and Ravi Kannan: Quick approximation to matrices and applications. Combinatorica, 19(2):175–220, 1999. [doi:10.1007/s004930050052]

[13] Michel X. Goemans and David P. Williamson: Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM, 42(6):1115–1145, 1995. Preliminary version in STOC'94. [doi:10.1145/227683.227684]

[14] John C. Gower: Generalized Procrustes analysis. Psychometrika, 40(1):33–51, 1975. [doi:10.1007/BF02291478]

[15] John C. Gower and Garmt B. Dijksterhuis: Procrustes Problems. Volume 30 of Oxford Statistical Science Series. Oxford University Press, 2004.

[16] Alexander Grothendieck: Résumé de la théorie métrique des produits tensoriels topologiques. Bol. Soc. Mat. São Paulo, 8:1–79, 1953.

[17] Martin Grötschel, László Lovász, and Alexander Schrijver: Geometric Algorithms and Combinatorial Optimization. Volume 2 of Algorithms and Combinatorics. Springer, second edition, 1993. [doi:10.1007/978-3-642-78240-4]

[18] Uffe Haagerup: The Grothendieck inequality for bilinear forms on C*-algebras. Adv. in Math., 56(2):93–116, 1985. [doi:10.1016/0001-8708(85)90026-X]

[19] Uffe Haagerup: A new upper bound for the complex Grothendieck constant. Israel J. Math., 60(2):199–224, 1987. [doi:10.1007/BF02790792]

[20] Uffe Haagerup and Takashi Itoh: Grothendieck type norms for bilinear forms on C*-algebras. J. Operator Theory, 34(2):263–283, 1995. [JOT].

[21] Uffe Haagerup and Magdalena Musat: The Effros-Ruan conjecture for bilinear forms on C*-algebras. Invent. Math., 174(1):139–163, 2008. [doi:10.1007/s00222-008-0137-7]

[22] Graham J. O. Jameson: The interpolation proof of Grothendieck's inequality. In Proc. Edinburgh Math. Soc., volume 28, pp. 217–223, 1985. [doi:10.1017/S0013091500022653]

[23] Norman L. Johnson, Samuel Kotz, and Narayanaswamy Balakrishnan: Continuous Univariate Distributions. Volume 2. Wiley, 1995.

[24] William B. Johnson and Joram Lindenstrauss: Basic concepts in the geometry of Banach spaces. In Handbook of the geometry of Banach spaces, Vol. I, pp. 1–84. North-Holland, 2001. [doi:10.1016/S1874-5849(01)80003-6]

[25] Sten Kaijser: A simple-minded proof of the Pisier-Grothendieck inequality. In Ron Blei and Stuart Sidney, editors, Banach Spaces, Harmonic Analysis, and Probability Theory, volume 995 of Lecture Notes in Mathematics, pp. 33–55. Springer, 1983. [doi:10.1007/BFb0061887]

[26] Subhash Khot and Assaf Naor: Grothendieck-type inequalities in combinatorial optimization. Commun. Pure Appl. Math., 65(7):992–1035, 2012. [doi:10.1002/cpa.21398]

[27] Jean-Louis Krivine: Théorèmes de factorisation dans les espaces réticulés. In Séminaire Maurey-Schwartz 1973–1974: Espaces L^p, applications radonifiantes et géométrie des espaces de Banach, Exp. Nos. 22 et 23, p. 22. Centre de Math., École Polytech., Paris, 1974. Found at EUDML.

[28] Jean-Louis Krivine: Constantes de Grothendieck et fonctions de type positif sur les sphères. Adv. in Math., 31(1):16–30, 1979. [doi:10.1016/0001-8708(79)90017-3]

[29] Nojun Kwak: Principal component analysis based on L1-norm maximization. IEEE Trans. Pattern Anal. Mach. Intell., 30(9):1672–1680, 2008. [doi:10.1109/TPAMI.2008.114]

[30] Joram Lindenstrauss and Aleksander Pełczyński: Absolutely summing operators in L_p-spaces and their applications. Studia Math., 29(3):275–326, 1968. Found at EUDML.

[31] László Lovász and Balázs Szegedy: Szemerédi's lemma for the analyst. Geom. Funct. Anal., 17(1):252–270, 2007. [doi:10.1007/s00039-007-0599-6]

[32] Michael McCoy and Joel A. Tropp: Two proposals for robust PCA using semidefinite programming. Electron. J. Statist., 5:1123–1160, 2011. [doi:10.1214/11-EJS636]

[33] Elisabeth Morand and Jérôme Pagès: Analyse factorielle multiple procustéenne. J. Soc. Fr. Stat. & Rev. Stat. Appl., 148(2):65–97, 2007. Found at EUDML.

[34] Assaf Naor, Oded Regev, and Thomas Vidick: Efficient rounding for the noncommutative Grothendieck inequality. In Proc. 45th STOC, pp. 71–80. ACM Press, 2013. [doi:10.1145/2488608.2488618]

[35] Arkadi Nemirovski: Sums of random symmetric matrices and quadratic optimization under orthogonality constraints. Math. Program., 109(2-3):283–317, 2007. [doi:10.1007/s10107-006-0033-0]

[36] Gilles Pisier: Grothendieck's theorem for noncommutative C*-algebras, with an appendix on Grothendieck's constants. J. Funct. Anal., 29(3):397–415, 1978. [doi:10.1016/0022-1236(78)90038-1]

[37] Gilles Pisier: Grothendieck's theorem, past and present. Bull. Amer. Math. Soc., 49(2):237–323, 2012. [doi:10.1090/S0273-0979-2011-01348-9, arXiv:1101.4195]

[38] Gilles Pisier and Dimitri Shlyakhtenko: Grothendieck's theorem for operator spaces. Invent. Math., 150(1):185–217, 2002. [doi:10.1007/s00222-002-0235-x]

[39] James A. Reeds: A new lower bound on the real Grothendieck constant. Unpublished manuscript, available at author's home page, 1991.

[40] Oded Regev and Thomas Vidick: Elementary proofs of Grothendieck theorems for completely bounded norms. J. Operator Theory, 71(2):491–506, 2014. [doi:10.7900/jot.2012jul02.1947, arXiv:1206.4025]

[41] Oded Regev and Thomas Vidick: Quantum XOR games. ACM Transactions on Computation Theory (ToCT), 2014. To appear. Preliminary version in CCC'13. [arXiv:1207.4939]

[42] Alexander Shapiro and Johan D. Botha: Dual algorithms for orthogonal Procrustes rotations. SIAM J. Matrix Anal. Appl., 9(3):378–383, 1988. [doi:10.1137/0609032]

[43] Anthony So: Moment inequalities for sums of random matrices and their applications in optimization. Math. Program., 130(1):125–151, 2011. [doi:10.1007/s10107-009-0330-5]

[44] Mikkel B. Stegmann and David Delgado Gomez: A Brief Introduction to Statistical Shape Analysis, 2002. Available at IMM.

[45] Boris S. Tsirelson: Quantum analogues of the Bell inequalities. The case of two spatially separated domains. J. Soviet Math., 36(4):557–570, 1987. [doi:10.1007/BF01663472]


AUTHORS

      Assaf Naor
      Professor
      Courant Institute of Mathematical Sciences, New York University, New York, NY
      naor cims nyu edu
      http://www.cims.nyu.edu/~naor/



    Oded Regev
    Professor
    Courant Institute of Mathematical Sciences, New York University, New York, NY
    regev cims nyu edu
    http://www.cims.nyu.edu/~regev/


    Thomas Vidick
    Assistant Professor
    California Institute of Technology, Pasadena, CA
    vidick cms caltech edu
    http://cms.caltech.edu/~vidick


ABOUT THE AUTHORS

Assaf Naor's research focuses on analysis and metric geometry, and their interactions
       with approximation algorithms and complexity theory. He received his Ph. D. from
       the Hebrew University in 2002, advised by Joram Lindenstrauss. He is a Professor of
       Mathematics and Computer Science at the Courant Institute of Mathematical Sciences of
       New York University, where he has been a faculty member since 2006. Prior to joining
       the Courant Institute he was a researcher at the Theory Group of Microsoft Research in
       Redmond WA. Starting fall 2014 he will be a Professor of Mathematics at Princeton
       University.


Oded Regev graduated from Tel Aviv University in 2001 under the supervision of Yossi
       Azar. He spent two years as a postdoc at the Institute for Advanced Study, Princeton,
       and one year at the University of California, Berkeley. He recently joined the Courant
       Institute of Mathematical Sciences and is still trying to get used to life in NYC. His
       research interests include computational aspects of lattices, quantum computation, and
       other topics in theoretical computer science.


Thomas Vidick graduated from UC Berkeley in 2011; his advisor was Umesh Vazirani.
       His thesis focused on the study of quantum entanglement in multi-prover interactive
       proof systems and in quantum cryptography. After a postdoctoral scholarship at MIT
       under the supervision of Scott Aaronson, he moved back to sunny California. He is
       currently an assistant professor in Caltech’s department of Computing and Mathematical
       Sciences, where his research is stimulated by the humbling mark left by the previous
occupants of his office and his neighbors': Alexei Kitaev and Richard Feynman.



