Complete Convergence of Message Passing Algorithms for Some Satisfiability Problems

Authors Uriel Feige, Elchanan Mossel, Dan Vilenchik,
Plaintext
                         T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651
                                       www.theoryofcomputing.org




 Complete Convergence of Message Passing
 Algorithms for Some Satisfiability Problems∗
                  Uriel Feige                  Elchanan Mossel†                 Dan Vilenchik
                    Received January 22, 2012; Revised May 4, 2013; Published June 15, 2013




        Abstract: In this paper we analyze the performance of Warning Propagation, a popular
        message passing algorithm. We show that for 3CNF formulas drawn from a certain distribu-
        tion over random satisfiable 3CNF formulas, commonly referred to as the planted-assignment
        distribution, running Warning Propagation in the standard way (run message passing until
        convergence, simplify the formula according to the resulting assignment, and satisfy the
        remaining subformula, if necessary, using a simple “off the shelf” heuristic) works when the
        clause-variable ratio is a sufficiently large constant. We are not aware of previous rigorous
        analysis showing the effectiveness of message passing algorithms for satisfiability instances.

ACM Classification: F.2.2, G.3
AMS Classification: 68Q25, 90C59
Key words and phrases: constraint satisfaction, average case, SAT, statistical physics

1     Introduction
A CNF formula over the variables x1 , x2 , . . . , xn is a conjunction of clauses C1 ,C2 , . . . ,Cm where each
clause is a disjunction of one or more literals. Each literal is either a variable or its negation. A formula is
said to be in k-CNF form if every clause contains exactly k literals. A CNF formula is satisfiable if there is
a Boolean assignment to the variables such that every clause contains at least one literal which evaluates
to true. 3SAT, the language of all satisfiable 3CNF formulas, is well known to be NP-complete [13].
    ∗ An extended abstract of this paper appeared in the Proceedings of RANDOM 2006 [16].
    † Supported by DOD ONR grants N000141110140 and N0014-07-1-05-06.



© 2013 Uriel Feige, Elchanan Mossel, and Dan Vilenchik
c b Licensed under a Creative Commons Attribution License (CC-BY)                           DOI: 10.4086/toc.2013.v009a019
                        U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

     The plethora of worst-case NP-hardness results for many interesting optimization problems motivates
the study of heuristics that give “useful” answers for “typical” subset of the problem instances. In this
paper we seek to evaluate those two measures rigorously and for that we shall use random models and
average case analysis.
    A random 3CNF formula consists of m clauses chosen uniformly at random from the set of all
8 n3 possible ones (m is a parameter of the distribution, and m/n is referred to as the clause-variable
ratio, or the density, of the random instance). This distribution shows a sharp threshold with respect
to satisfiability [19]. Specifically, a random 3CNF with clause-variable ratio below the threshold is
satisfiable w. h. p. (with high probability, meaning with probability tending to 1 as n goes to infinity )
and one with ratio above the threshold is unsatisfiable w. h. p. This threshold is not known exactly (and
not even known to be independent of n), known bounds are at least 3.52 [21] and at most 4.506 [15].
Experimental results indicate that the threshold is closer to the higher end of the interval [14], and lies
roughly around 4.2.
     In this paper we study random 3CNF formulas with an arbitrary density. To this end, the aforemen-
tioned distribution needs to be refined, as otherwise, at some density, most formulas stop being satisfiable.
One possibility, favored by many researchers in similar settings [18, 5, 22, 23, 4, 9] is to use the planted
distribution, denoted throughout by Pplant
                                        n,p . A random 3CNF in this distribution is obtained by first picking
an assignment ϕ to the variables, and then including every clause satisfied by ϕ with probability p = p(n),
thus guaranteeing that the resulting instance is satisfiable.
     Our contribution is analyzing the performance of Warning Propagation (WP for brevity), a popular
message passing algorithm, when applied to satisfiable formulas drawn from a certain random distribution
over satisfiable 3CNF formulas, commonly called the planted distribution. We show that the standard way
of running message passing algorithms—run message passing until convergence, simplify the formula
according to the resulting assignment, and satisfy the remaining subformula, if possible, using a simple
“off the shelf” heuristic—works for planted random satisfiable formulas with a sufficiently large yet
constant clause-variable ratio. The effectiveness of message passing algorithms was experimentally shown
for “hard” formulas [7], for which other heuristics fail; a decimation process based on Belief Propagation
was analyzed in [12], where it was shown that the process fails to find a satisfying assignment when the
input is a random formula with a certain clause-variable ratio (below the satisfiability threshold).
     Our result is the first to rigorously prove the effectiveness of a message passing algorithm for the
solution of a non-trivial random SAT distribution. Let us now move to a formal description of our results.


1.1   Warning propagation
WP is a simple iterative message passing algorithm, that serves as an excellent intuitive introduction to
more involved message passing algorithms such as Belief Propagation [26] and Survey Propagation [7].
These algorithms are based on the cavity method in which the messages that a clause (or a variable)
receives are meant to reflect a situation in which a “cavity” is formed, namely, the receiving clause (or
variable) is no longer part of the formula. Messages in the WP algorithm can be interpreted as “warnings,”
telling a clause the values that variables will have if the clause “keeps quiet” and does not announce
its wishes, and telling a variable which clauses will not be satisfied if the variable does not commit to
satisfying them. We now present the algorithm in a formal way.

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                            618
                    C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

    Let F be a CNF formula. For a variable x, let N + (x) be the set of clauses in F in which x appears
positively (namely, as the literal x), and N − (x) be the set of clauses in which x appears negatively. For
a clause C, let N + (C) be the set of variables that appear positively in C, and respectively N − (C) for
negative ones. There are two types of messages involved in the WP algorithm. Messages sent from a
variable xi to a clause C j in which it appears:

                             xi → C j =            ∑             Ck → xi −            ∑             Ck → xi .
                                           Ck ∈N + (xi ),k6= j                Ck ∈N − (xi ),k6= j

If xi appears only in C j then we set the message to 0. The intuitive interpretation of this message should
be xi signals C j what is currently its favorable assignment by the other clauses it appears in (a positive
message means TRUE, negative one means FALSE and a 0 message means UNASSIGNED). The second
type are messages sent from a clause C j to a variable xi appearing in C j :

                        C j → xi =          ∏              I<0 (xk → C j )          ∏              I>0 (xk → C j ) ,
                                     xk ∈N + (C j ),k6=i                     xk ∈N − (C j ),k6=i

where I<0 (b) is an indicator function which is “1” iff b < 0 (respectively I>0 ). If C j contains only xi
(which cannot be the case in 3CNF formulas) then the message is set to 1. C j → xi = 1 can be intuitively
interpreted as C j sending a warning to xi asking it to commit to satisfying C j (as all other literals signaled
C j that currently they evaluate to FALSE). Lastly, we define the current assignment of a variable xi to be

                                     Bi =        ∑           C j → xi −       ∑            C j → xi .
                                            C j ∈N + (xi )                C j ∈N − (xi )

If Bi > 0 then x is assigned TRUE, if Bi < 0 then xi is assigned FALSE, otherwise xi is UNASSIGNED .
Assume some order on the clause-variable messages (e. g., the lexicographical order on pairs of the form
( j, i) representing the message C j → xi ). Given a vector α ∈ {0, 1}3m in which every entry is the value of
the corresponding C j → xi message, a partial assignment ψ ∈ {TRUE, FALSE, UNASSIGNED}n can be
generated according to the corresponding Bi values (as previously explained).
      Given a 3CNF formula F on n variables and m clauses, the factor graph of F, denoted by FG(F), is
the following graph representation of F. The factor graph is a bipartite graph, FG(F) = (V1 ∪V2 , E) where
V1 = {x1 , x2 , . . . , xn } (the set of variables) and V2 = {C1 ,C2 , . . . ,Cm } (the set of clauses). (xi ,C j ) ∈ E iff xi
appears in C j . For a 3CNF F with m clauses it holds that #E = 3m, because every clause contains exactly
3 different variables. (Here and elsewhere, #A denotes the cardinality of a set A. The notation |a| will
denote the absolute value of a real number a.)
      It would be convenient to think of the messages in terms of the corresponding factor graph. Every
undirected edge (xi ,C j ) of the factor graph is replaced with 2 anti-parallel directed edges, (xi → C j )
associated with the message xi → C j and respectively the edge (C j → xi ).
      Let us now formally describe the algorithm.

Warning Propagation(3CNF formula F) :
1. construct the corresponding factor graph FG(F).
2. randomly initialize the clause-variable messages to 0 or 1.
3. repeat until no clause-variable message changed from the

                        T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                                          619
                        U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

   previous iteration:
   3.a randomly order the edges of FG(F).
   3.b update all clause-variable messages C j → xi according
       to the random edge order.
4. compute a partial assignment ψ according to the Bi messages.
5. return ψ.

     The variable-clause message updates are implicit in line 3.b: when evaluating the clause-variable
message along the edge C → x, C = (x ∨ y ∨ z), the variable-clause messages concerning this calculation
(z, y → C) are evaluated on-the-fly using the last updated values Ci → y, C j → z (allowing feedback from
the same iteration). We allow the algorithm not to terminate (the clause-variable messages may keep
changing every iteration). If the algorithm does return an assignment ψ then we say that it converged. In
practice it is common to limit in advance the number of iterations, and if the algorithm does not converge
by then, return a failure.

1.2   Our results
Given a 3CNF F, simplify F according to ψ, when ψ is a partial assignment, means: in every clause
substitute every assigned variable with the value given to it by ψ. If a clause contains a literal which
evaluates to true, remove the clause. From the remaining clauses, remove all literals which evaluate to
false. The resulting instance is not necessarily in 3CNF form, as clauses may have any number of literals
between 0 and 3. Denote by F|ψ the 3CNF F simplified according to ψ. Note that F|ψ may contain empty
clauses, in which case it is not satisfiable. For a set of variables A ⊆ V , denote by F[A] the set of clauses
in which all variables belong to A.
    We call a 3CNF formula simple, if it can be satisfied using simple well-known heuristics (examples
include very sparse random 3CNF formulas which are solvable w. h. p. using the pure-literal heuristic [8],
formulas with small weight terminators—to use the terminology of [3]—solvable w. h. p. using RWalkSat,
etc). This is a somewhat informal notion, but it suffices for our needs.
Theorem 1.1. Let F be a 3CNF formula randomly sampled according to Pplant                           2
                                                                                n,p , where p ≥ d/n , d a
sufficiently large constant. Then the following holds w. h. p. (the probability taken over the choice of F,
and the random choices in lines 2 and 4 of the WP algorithm). There exists a satisfying assignment ϕ ∗
(not necessarily the planted one) such that:

   1. WP(F) converges after at most O(log n) iterations.
   2. Let ψ be the partial assignment returned by WP(F), let VA denote the variables assigned to either
      TRUE or FALSE in ψ, and VU the variables left UNASSIGNED . Then for every variable x ∈ VA ,
      ψ(x) = ϕ ∗ (x). Moreover, #VA ≥ (1 − e−Θ(d) )n.
   3. F|ψ is a simple formula which can be satisfied in time O(n).

Remark 1.2. Theorem 1.1 relates to the planted 3SAT model, but [11] implies that it also true to the
uniform random 3SAT distribution, in which a satisfiable formula with m clauses is chosen uniformly at
random among all such formulas.

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                              620
                  C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

Proposition 1.3. Let F be a 3CNF formula randomly sampled according to Pplant                          2
                                                                             n,p , where p ≥ c log n/n ,
with c a sufficiently large constant, and let ϕ be its planted assignment. Then w. h. p. after at most 2
iterations, WP(F) converges, and the returned ψ equals ϕ.


     It is worth noting that formulas in Pplant         2
                                            n,p , with n p some large constant, are not known to be simple
(in the sense that we alluded to above). For example, it is shown in [3] that RWalkSat is very unlikely to
hit a satisfying assignment in polynomial time when running on a random Pplant    n,p instance in the setting
of Theorem 1.1. Nevertheless, planted 3CNF formulas with sufficiently large (constant) density were
shown to be solvable w. h. p. in [18] using a spectral algorithm. Though in our analysis we use similar
techniques to [18] (which relies himself on [4]), our result is conceptually different in the following sense.
In [4, 9, 18], the starting point is the planted distribution, and then one designs an algorithm that works
well under this distribution. The algorithm may be designed in such a way that makes its analysis easier.
In contrast, our starting point is a given algorithm, WP, and then we ask for which input distributions it
works well. We cannot change the algorithm in ways that would simplify the analysis. Another difference
between our work and that of [4, 9, 18] is that unlike the algorithms analyzed in those other papers, WP
is a randomized algorithm, a fact which makes its analysis more difficult. We could have simplified our
analysis had we changed WP to be deterministic (for example, by initializing all clause-variable messages
to 1 in step 2 of the algorithm), but there are good reasons why WP is randomized. For example, it can
be shown that (the randomized version) WP converges with probability 1 on 2CNF formulas that form
one cycle of implications, but might not converge if step 4 does not introduce fresh randomness in every
iteration of the algorithm (details omitted).

     The planted 3SAT model is also similar to LDPC codes in many ways. Both constructions are based
on random factor graphs. In codes, the received corrupted codeword provides noisy information on a
single bit or on the parity of a small number of bits of the original codeword. In Pplant  n,p , ϕ being the
planted assignment, the clauses containing a variable xi provide noisy information on the polarity of
ϕ(xi ). Comparing our results with the coding setting, the effectiveness of message passing algorithms for
amplifying local information in order to decode codes close to channel capacity was recently established
in a number of papers, e. g., [25, 27]. Our results are similar in flavor, however the combinatorial analysis
provided here allows to recover an assignment satisfying all clauses, whereas in the random LDPC codes
setting, message passing allows to recover only 1 − o(1) fraction of the codeword correctly. In [24] it is
shown that for the erasure channel, all bits may be recovered correctly using a message passing algorithm,
however in this case the LDPC code is designed so that message passing works for it.



1.3   Paper’s organization

The remainder of the paper is structured as follows. Section 2 provides an overview that may help the
reader follow the more technical parts of the proofs. In Section 3 we discuss some properties that a
typical instance in Pplant
                      n,p possesses. Using these properties, we prove in Section 4 Theorem 1.1 and
Proposition 1.3. In Section 6 we summarize our results and discuss potentially interesting lines for further
research.

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                             621
                         U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

2    Proof outline
Let us first consider some possible fixed points of the Warning Propagation (WP) algorithm. The trivial
fixed point is the one in which all messages are 0. One may verify that this is the unique fixed point
in some cases when the underlying 3CNF formula is very easy to satisfy, such as when all variables
appear only positively, or when every clause contains at least two variables that do not appear in any other
clause. A local maximum fixed point is one that corresponds to a strict local maximum of the underlying
MAX-3SAT instance, namely to an assignment τ to the variables in which flipping the truth assignment of
any single variable causes the number of satisfied clauses to strictly decrease. The reader may verify that
if every clause C sends a 1 message to a variable if no other variable satisfies C under τ, and a 0 message
otherwise, then this is indeed a fixed point of the WP algorithm. Needless to say, the WP algorithm may
have other fixed points, and might not converge to a fixed point at all.
     Recall the definition of Pplant
                                 n,p . First a truth assignment ϕ to the variables V = {x1 , x2 , . . . , xn } is
picked uniformly at random. Next, every clause satisfied by ϕ is included in the formula    with probability
p (in our case p ≥ d/n2 , d a sufficiently large                            3 − 1) · n clauses satisfied by
                                                                                      
                                                    constant). There  are (2         3
ϕ, hence the expected size of F is p · 7 · n3 = 7dn/6 + o(n) (when d is constant, then this is linear in
                                              

n, and therefore such instances are sometimes referred to as sparse 3CNF formulas). To simplify the
presentation, we may assume due to symmetry that the planted assignment ϕ is the all-one vector.
     To aid intuition, we list some (incorrect) assumptions and analyze the performance of WP on a Pplant    n,p
instance under these assumptions.

    1. In expectation, a variable appears in 4 n2 p = 2d + o(1) clauses positively, and in 3d/2 + o(1)
                                                 

       clauses negatively. Our first assumption is that for every variable, its number of positive and
       negative appearances is equal to these expectations.

    2. We say that a variable supports a clause with respect to the planted assignment (which was assumed
       without loss of generality to be the all 1 assignment) if it appears positively in the clause, and
       the other variables in the clause appear negatively. Hence the variable is the only one to satisfy
       the clause under the planted assignment. For every variable in expectation there are roughly d/2
       clauses that it supports. Our second assumption is that for every variable, the number of clauses
       that it supports is equal to this expectation.

    3. Recall that in the initialization of the WP algorithm, every clause-variable message C → x is 1 w. p.
       1/2, and 0 otherwise. Our third assumption is that with respect to every variable, half the messages
       that it receives from clauses in which it is positive are initialized to 1, and half the messages that it
       receives from clauses in which it is negative are initialized to 1.

    4. Recall that in step 3.b of WP, clause-variable messages are updated in a random order. Our fourth
       assumption is that in each iteration of step 3, the updates are based on the values of the other
       messages from the previous iteration, rather than on the last updated values of the messages (that
       may correspond either to the previous iteration or the current iteration, depending on the order
       in which clause-variable messages are visited). Put differently, we assume that in step 3.b all
       clause-variable messages are evaluated in parallel.

                      T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                                622
                  C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

    Observe that under the first two assumptions, the planted assignment is a local maximum of the
underlying MAX-3SAT instance. We show that under the third and fourth assumption, WP converges
to the corresponding local maximum fixed point in two iterations. Based on the initial messages as in
our third assumption, the messages that variables send to clauses are all roughly (2d − 3d/2)/2 = d/4.
Following the initialization, in the first iteration of step 3 every clause C that x supports will send x the
message 1, and all other messages will be 0. Here we used our fourth assumption. (Without our fourth
assumption, WP may run into trouble as follows. The random ordering of the edges in step 3 may place
for some variable x all messages from clauses in which it appears positively before those messages from
clauses in which it appears negatively. During the iteration, some of the messages from the positive
clauses may change from 1 to 0. Without our fourth assumption, this may at some point cause x to signal
to some clauses a negative rather than positive value.) The set of clause-variable messages as above will
become a fixed point and repeat itself in the second iteration of step 3. (For the second iteration, the
fourth assumption is no longer needed.) Hence the algorithm will terminate after the second iteration.
    Unfortunately, none of the four assumptions that we made are correct. Let us first see to what extent
they are violated in the context of Proposition 1.3, namely, when d is very large, significantly above log n.
Standard concentration results for independent random variables then imply that the first, second and
third assumptions simultaneously hold for all variables, up to small error terms that do not affect the
analysis. Our fourth assumption is of course never true, simply because we defined WP differently. This
complicates the analysis to some extent and makes the outcome depend on the order chosen in the first
iteration of step 3.a of the algorithm. However, it can be shown that for most such orders, the algorithm
indeed converges to the fixed point that corresponds to the planted assignment.
    The more difficult part of our work is the case when d is constant (though a sufficiently large constant),
as in the case of Theorem 1.1. In this case, already our first two assumptions are incorrect. Random
fluctuations with respect to expected values will w. h. p. cause a linear fraction of the variables to appear
negatively more often than positively, or not to support any clause (with respect to the planted assignment).
In particular, the planted assignment would no longer be a local maximum with respect to the underlying
MAX-3SAT instance. Nevertheless, as is known from previous work [18], a large fraction of the variables
will behave sufficiently close to expectation so that the planted assignment is a local maximum with
respect to these variables. Slightly abusing notation, these set of variables are often called the core of
the 3CNF formula. Our proof plan is to show that WP does converge, and that the partial assignment in
step 4 assigns all core variables their correct planted value. Moreover, for non-core variables, we wish
to show that the partial assignment does not make any unrecoverable error—whatever value it assigns
to some of them, it is always possible to assign values to those variables that are left unassigned by the
partial assignment so that the input formula is satisfied. The reason why we can expect such a proof plan
to succeed is that it is known to work if one obtains an initial partial assignment by means other than WP,
as was already done in [18, 17].
    Let us turn now to our third assumption. It too is violated for a linear fraction of the variables, but
is nearly satisfied for most variables. This fact marks one point of departure for our work compared to
previous work [18, 17]. Our definition of the core variables will no longer depend only on the input
formula, but also on the random choice of initialization messages. This adds some technical complexity
to our proofs.
    The violation of the fourth assumption is perhaps the technical part in which our work is most

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                             623
                        U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

interesting. It relates to the analysis of WP on factor graphs that contain cycles, which is often a
stumbling point when one analyzes message passing algorithms. Recall that when d is very large
(Proposition 1.3), making the fourth assumption simplifies the proof of convergence of WP. Hence
removing this assumption in that case becomes a nuisance. On the other hand, when d is smaller (as
in Theorem 1.1), removing this assumption becomes a necessity. This will become apparent when we
analyze convergence of WP on what we call free cycles. If messages in step 3.b of WP are updated based
on the value of other messages in the previous iteration (as in our fourth assumption), then the random
choice of order in step 3.a of WP does not matter, and one can design examples in which the messages in
a free cycle never converge. In contrast, if messages in step 3.b of WP are updated based on the latest
value of other messages (either from the previous iteration or from the current iteration, whichever one is
applicable), free cycles converge with probability 1 (as we shall later show).
     To complete the proof plan, we still need to show that simplifying the input formula according
to the partial assignment returned by WP results in a formula that is satisfiable, and moreover, that a
satisfying assignment for this sub-formula can easily be found. The existential part (the sub-formula being
satisfiable) will follow from a careful analysis of the partial assignment returned by WP. The algorithmic
part (easily finding an assignment that satisfies the sub-formula) is based on the same principles used
in [4, 18], showing that the sub-formula breaks into small connected components.

                                          plant
3     Properties of a random Pn,p instance
In this section we discuss relevant properties of a random Pplant
                                                            n,p instance. This section is rather technical
in nature. The proofs are based on probabilistic arguments that are standard in our context. In the rest of
the paper, for simplicity of presentation, we assume due to symmetry that the planted assignment is the
all TRUE assignment.

3.1   Preliminaries
The following well-known concentration result (see, for example [6, Corollary A.1.14]) will be used
several times in the proof. We denote by B(n, p) the binomial random variable with parameters n and p.

Theorem 3.1 (Chernoff’s inequality). If X ∼ B(n, p) and ε > 0, then
                                                              3
                                 Pr |X − E[X]| ≥ εE[X] ≤ e−Ω(ε E[X]) .
                                                      


     We shall also use the following version of the Martingale argument, which can be found in [6, Page
101]. Assume the probability space is generated by a finite set of mutually independent Yes/No choices,
indexed by i ∈ I. Given a random variable Y on this space, let pi denote the probability that choice i is
Yes. Let ci be such that changing choice i (keeping all else the same) can change Y by at most ci ≤ c. We
call pi (1 − pi )c2 the variance of choice i. Now consider a solitaire game in which one finds the value of
Y by making queries of an always truthful oracle. The choice of query can depend on previous responses.
All possible questioning lines can be naturally represented in a decision tree form. A “line of questioning”
is a path from the root to a leaf of this tree, a sequence of questions and responses that determine Y . The
total variance of a line of questioning is the sum of the variances of the queries in it.

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                           624
                  C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

Theorem 3.2. For every ε > 0 there exists a δ > 0 so that if for every line of questioning that determines
Y , the total variance is at most σ 2 , then
                                                                2
                                    Pr[|Y − E[Y ]| > ασ ] ≤ 2eα /(2(1+ε)) ,

for all positive α with αC < σ (1 + ε)δ .

3.2   Stable variables
Definition 3.3. A variable x supports a clause C with respect to a partial assignment ψ, if it is the only
variable to satisfy C under ψ, and the other two variables are assigned by ψ.

Proposition 3.4. Let F be a 3CNF formula randomly sampled according to Pplant                        2
                                                                               n,p , where p ≥ d/n , d a
sufficiently large constant. Let FSUPP be a random variable counting the number of variables in F whose
support w.r.t. ϕ is less than d/3. Then w. h. p. FSUPP ≤ e−Θ(d) n.

Proof. Every variable is expected to support (d/n2 ) · n2 = d/2 + O(1/n) clauses, thus using Chernoff’s
                                                         

inequality the probability of a single variable supporting less than d/3 clauses is at most e−Θ(d) . Using
linearity of expectation, E[FSUPP ] ≤ e−Θ(d) n. To prove concentration around the expected value we use
Chernoff’s bound once more as the support of one variable is independent of the others (since it concerns
different clauses which are included independently of each other). The claim then follows.

     Following the definitions in Section 1.1, given a CNF F and a variable x, we let N ++ (x) be the set of
clauses in F in which x appears positively but does not support w. r. t. ϕ. Let N s (x) be the set of clause
in F which x supports w. r. t. ϕ. Let π = π(F) be some ordering of the clause-variable message edges
in the factor graph of F. For an index i and a literal `x (by `x we denote a literal over the variable x) let
π −i (`x ) be the set of clause-variable edges (C → x) that appear before index i in the order π and in which
x appears in C as `x . For a set of clause-variable edges E and a set of clauses C we denote by E ∩ C the
subset of edges containing a clause from C as one endpoint.

Definition 3.5. Let π be an ordering of the clause-variable messages of a formula F. A variable x is
stable in F w. r. t. π if for every location i in π that contains a message C → x, C = (`x ∨ `y ∨ `z ), the
following holds:

   1. |#π −i (y) ∩ N ++ (y) − #π −i (ȳ) ∩ N − (y)| ≤ d/30,

   2. |#N ++ (y) − #N − (y)| ≤ d/30,

   3. #N s (y) ≥ d/3,

and the same holds for z.

Proposition 3.6. Let F be a 3CNF formula randomly sampled according to Pplant n,p , where p ≥ d/n ,
                                                                                                    2

d a sufficiently large constant. Let π be a random ordering of the clause-variable messages, and
FUNSTAB be a random variable counting the number of variables in F which are not stable. Then w. h. p.
FUNSTAB ≤ e−Θ(d) n.

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                             625
                        U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

Proof. We start by bounding E[FUNSTAB ]. Observe that given that y appears in a clause C, and does not
support it, the probability of y appearing positively or negatively is equal. Hence, for a message C → x in
location i in π, C = (`x ∨ `y ∨ `z ), we have

                             E[|#π −i (y) ∩ N ++ (y) − #π −i (ȳ) ∩ N − (y)|] = 0 ,

and the same for z. If however

                             |#π −i (y) ∩ N ++ (y) − #π −i (ȳ) ∩ N − (y)| > d/30

then at least one of the quantities deviates from its expectation by d/60.
    Look at #π −i (y) ∩ N ++ (y)—this is the number of successes in draws without replacement. It is
known that this quantity is more concentrated than the corresponding quantity if the draws were made
with replacement [20]. In particular, since the expectation of #π −i (y) ∩ N ++ (y) is O(d) it follows from
Chernoff’s bound that the probability that it deviates from its expectation by more than d/60 is e−Θ(d) .
A similar statement holds for #π −i (ȳ) ∩ N − (y). Properties (2) and (3) are bounded similarly using
concentration results.
    The calculations above hold in particular for the first 5d appearances of messages involving x. As for
message 5d + 1, the probability of this message causing x to become unstable is bounded by the event that
x appears in more than 5d clauses. As x is expected to appear in 3.5d clauses, the latter event happens
w.p. e−Θ(d) (again using Chernoff’s bound). To sum up,

                            Pr[x is unstable ] ≤ 5d · e−Θ(d) + e−Θ(d) = e−Θ(d) .

The bound on E[FUNSTAB ] follows by linearity of expectation.
      We are now left with proving that FUNSTAB is concentrated around its expectation, we do so using a
martingale argument. Define two new random variables, F1 counting the number of unstable variables x
s. t. there exists a clause C, containing x, and another variable y, such that y appears in more than log n
clauses, and F2 to be the unstable variables such that in all clauses in which they appear, all the other
variables appear in at most log n clauses. Observe that FUNSTAB = F1 + F2 . To bound F1 , observe that if
F1 ≥ 1, then in particular this implies that there exists a variable which appears in more than log n clauses
in F. This happens with probability o(1) since every variable is expected to appear only in O(d) clauses
(using Chernoff’s bound). To bound F2 we use a martingale argument in the constellation of [6], page 101.
We use the clause-exposure martingale (the clause-exposure martingale implicitly includes the random
ordering π, since one can think of the following way to generate the random instance—first randomly
shuffle all possible clauses, and then toss the coins). The exposure of a new clause C can change F2 by at
most 6 log n since every variable in C appears in at most log n clauses, namely with at most 2 log n other
variables that might become (un)stable due to the new clause. The martingale’s total variance, in the
                                                                                      √
terminology of Theorem 3.2, is σ 2 = Θ(dn log2 n). Theorem 3.2, with α = e−Θ(d) n/ log n, and the fact
that E[F2 ] ≤ E[FUNSTAB ], implies concentration around the expectation of F2 .

     Let α ∈ {0, 1}3#F be a clause-variable message vector. For a set of clause-variable message edges E
let 1α (E) be the set of edges along which the value is 1 according to α. For a set of clauses C, 1α (C)
denotes the set of clause-variable message edges in the factor graph of F containing a clause from C as
one endpoint and along which the value is 1 in α.

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                            626
                  C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

Definition 3.7. A variable x is violated by α in π if there exists a message C → x, C = (`x ∨ `y ∨ `z ), in
place i in π such that one of the following holds:

   1. |#1α (π −i (y) ∩ N ++ (y)) − #1α (π −i (ȳ) ∩ N − (y))| > d/30

   2. |#1α (N ++ (y)) − #1α (N − (y))| > d/30

   3. #1α (N s (y)) < d/7.

Or one of the above holds for z.

Proposition 3.8. Let F be a 3CNF formula randomly sampled according to Pplant                            2
                                                                                     n,p , where p ≥ d/n , d a
sufficiently large constant. Let X be the set of stable variables w. r. t. an arbitrary ordering π. Let α be a
random clause-variable message vector. Let FVIO be a random variable counting the number of violated
variables in X. Then w. h. p. FVIO ≤ e−Θ(d) n.

Proof. If #X ≤ e−Θ(d) n then we are done, as FVIO ≤ #X. Therefore, assume #X = e−Θ(d) n. As in the proof
of Proposition 3.6, we first bound E[FVIO ], and then prove concentration using a martingale argument.
Consider a stable variable x in F w. r. t. to an ordering π of the clause-variable messages. Let α be a
random assignment to the clause-variable messages. Consider a clause-variable message edge C → x at
location i in π. x is stable and therefore

                             |#π −i (y) ∩ N ++ (y) − #π −i (y) ∩ N − (y)| ≤ d/30 .

Since α is a random assignment

                       E[|#1α (π −i (y) ∩ N ++ (y)) − #1α (π −i (ȳ) ∩ N − (y))|] ≤ d/60 .

If however
                        |#1α (π −i (y) ∩ N ++ (y)) − #1α (π −i (ȳ) ∩ N − (y))| > d/30 ,                (3.1)
then at least one of the quantities in (3.1) deviated from its expectation by at least (d/30 − d/60)/2 =
d/120. Since both quantities are binomially distributed with expectation O(d), the probability of the
latter happening is e−Θ(d) , using standard concentration results.
     Properties (2) and (3) are bounded similarly using the Chernoff bound. Using the union bound as in the
proof of Proposition 3.6 and the linearity of expectation, one obtains that E[FVIO ] ≤ e−Θ(d) #X = e−Θ(d) n.
Finally, repeat the martingale argument in Proposition 3.6 (instead of a clause-exposure martingale, we
have a clause-variable message values exposure martingale). Both martingales involve Θ(n) random
variables, and therefore the same exact calculation holds in the case of the current proposition as well
(maybe with different constants, which are accounted for in the Θ-notation).

3.3   Dense subformulas
The next property we discuss is analogous to a property proved in [4] for random graphs. Loosely
speaking, a random graph typically does not contain a small induced subgraph with a large average
degree. A similar proposition for 3SAT can also be found in [18].

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                              627
                        U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

Proposition 3.9. Let c > 1 be an arbitrary constant. Let F be a 3CNF formula randomly sampled
according to Pplant              2
              n,p , where p ≥ d/n , d a sufficiently large constant. Then w. h. p. there exists no subset of
variables U, such that #U ≤ e−Θ(d) n and there are at least c#U clauses in F containing two variables
from U.
Proof. For a fixed set U of variables, #U = k, the number of clauses containing two variables from U is
                                          
                                          k
                                             (n − 2)23 ≤ 4k2 n .
                                          2

Each of these clauses is included independently w. p. d/n2 . Thus, the probability that ck of them are
included is at most        2   ck  2                ck 
                                                                 12kd ck
                                                                       
                            4k n     d        4k ne d
                                          ≤          · 2     ≤             .
                             ck     n2          ck    n            cn
Using the union bound, the probability there exists a “dense” set U is at most
                                 e−Θ(d) n              ck
                                               n   12kd
                                   ∑                            = O(d 2c /n2c−2 ) .
                                   k=2         k    cn

The last equality
                is obtained using standard calculations, and the standard estimate on the binomial
coefficient: nk ≤ (en/k)k .

3.4   The core variables
We describe a subset of the variables, denoted throughout by H and referred to as the core variables,
which plays a crucial role in the analysis. The notion of a stable variable is not enough to ensure that the
algorithm will set a stable variable according to the planted assignment, as it may happen that a stable
variable x appears in many of its clauses with unstable variables. Thus, x can be biased in the wrong
direction (by wrong we mean disagreeing with the planted assignment). However, if most of the clauses
in which x appears contain only stable variables, then this is already a sufficient condition to ensure
that x will be set correctly by the algorithm. The set H captures the notion of such variables. There are
several ways to define a set of variables with these desired properties, we present one of them, and give a
constructive way of obtaining it (though it has no algorithmic implications, at least not in our context).
Formally, H = H(F, ϕ, α, π) is constructed using the following iterative procedure:
Proposition 3.10. If both α and π are chosen uniformly at random then with probability 1 − n−Ω(d) ,
#H = (1 − e−Ω(d) )n.
Proof. Let H̄ = V \ H. Set δ = e−Θ(d) . Partition the variables in H̄ into variables that belong to A1 ∪ A2 ∪
A3 , and variables that were removed in the iterative step, H̄ it = H0 \ H. If #H̄ ≥ δ n, then at least one of
A1 ∪ A2 ∪ A3 , H̄ it has cardinality at least δ n/2. Consequently,

         Pr[#H̄ ≥ δ n] ≤ Pr[#A1 ∪ A2 ∪ A3 ≥ δ n/2] + Pr[#H̄ it ≥ δ n/2 #A1 ∪ A2 ∪ A3 ≤ δ n/2] .
                         |          {z           } |                    {z                  }
                                         (a)                                          (b)



                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                              628
                  C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

      Let A1 be the set of variables whose support w. r. t. ϕ is at most d/3.
      Let A2 be the set of non-stable variables w. r. t. π.
      Let A3 be the set of stable variables w. r. t. π which are violated by α.

         1. Set H0 = V \ (A1 ∪ A2 ∪ A3 ).

         2. While there exists a variable ai ∈ Hi which supports less than d/4 clauses in F[Hi ] OR
            appears in more than d/30 clauses not in F[Hi ] define Hi+1 = Hi \ {ai }.

         3. Let am be the last variable removed in step 2. Define H = Hm+1 .



Propositions 3.4, 3.6, and 3.8 are used to bound (a). To bound (b), observe that every variable that is
removed in iteration i of the iterative step (step 2), supports at least (d/3 − d/4) = d/12 clauses in which
at least another variable belongs to {a1 , a2 , . . . , ai−1 } ∪ A1 ∪ A2 ∪ A3 , or appears in d/30 clauses each
containing at least one of the latter variables. Consider iteration δ n/2. Assuming #A1 ∪ A2 ∪ A3 ≤ δ n/2,
by the end of this iteration there exists a set containing at most δ n variables, and there are at least
d/30 · δ n/2 · 1/3 clauses containing at least two variables from it (we divide by 3 as every clause might
have been counted 3 times). Plugging c = d/180 in Proposition 3.9, (b) is bounded. Finally observe that
(a) occurs with exponentially small probability, and (b) occurs with probability n−Ω(d) .

3.5    The factor graph of the non-core variables
Proposition 3.10 implies that for p = c log n/n2 , c a sufficiently large constant, w. h. p. H contains already
all variables. Therefore the following propositions are relevant for the setting of Theorem 1.1 (namely,
p = O(1/n2 )). In what follows we establish the typical structure of the factor graph induced on the
non-core variables.
Proposition 3.11. Let F be a 3CNF formula randomly sampled according to Pplant                      2
                                                                                 n,p , where p ≥ d/n , d a
sufficiently large constant, let π be the initial random ordering of the clause-variable messages, and α
the initial random clause-variable message vector. Let T be the factor graph induced on the non-core
variables. T enjoys the following properties:
   1. Every connected component contains w. h. p. O(log n) variables,
   2. every connected component contains w. h. p. at most one cycle,
   3. the probability of a cycle of length at least k in T is at most e−Θ(dk) .
    A proposition of similar flavor to (1) was proven in [18] though with respect to a different notion of
core. This alone would not have sufficed to prove our result, and we need (2) and (3) as well.
Corollary 3.12. Let f = f (n) be such that f (n) → ∞ as n → ∞. Then w. h. p. there is no cycle of length
f (n) in the non-core factor graph.
   The proof of Proposition 3.11 is quite long and technical. To avoid distraction, we go ahead and
prove Theorem 1.1 and Proposition 1.3, and defer the proof of Proposition 3.11 to Section 5.

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                               629
                         U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

4     Proof of Theorem 1.1 and Proposition 1.3
We start by giving an outline of the proof of Theorem 1.1. Proposition 1.3 is derived as an easy corollary
of that proof. To prove Theorem 1.1 we need to establish three properties:

    1. Convergence: the WP algorithm converges to a fixed point.
    2. Consistency: the partial assignment implied by this fixed point is consistent with some satisfying
       assignment.
    3. Simplicity: after simplifying the input formula by substituting in the values of the assigned variables,
       the remaining subformula is not only satisfiable (this is handled by consistency), but also simple.

     We assume that the formula F and the execution of WP are typical in the sense that Propositions 3.10
and 3.11 hold. First we prove that after one iteration WP sets the core variables H correctly (Bi agrees
with ϕ in sign) and this assignment does not change in later iterations. The proof of this property is rather
straightforward from the definition of a core. This establishes convergence and consistency for the core
variables. From iteration 2 onwards WP is basically running on F in which variables belonging to H
are substituted with their planted assignment. This subformula is satisfiable. Moreover, its factor graph
contains small (logarithmic size) connected components, each containing at most one cycle. This last
fact serves a dual purpose. It shows that if WP will eventually converge, the simplicity property will
necessarily hold. Moreover, it will assist us in proving convergence and consistency for the subformula.
Consider a connected component composed of a cycle and trees “hanging” on the cycle. Proving
convergence on the trees is done using a standard inductive argument. The more interesting part is proving
convergence on the cycle. The difficulty there is that messages on a cycle may have more than one fixed
point to which they may possibly converge, which makes it more difficult to prove that they converge at
all. Our proof starts with a case analysis that identifies those cases that have multiple fixed points. For
these cases we prove that almost surely random fluctuations caused by step 3.a of the WP algorithm will
lead to convergence to some fixed point. This is similar in flavor to the fact that a random-walk on a line
eventually reaches an endpoint of the line (even though one cannot tell a-priori which endpoint this will
be). Hand-in-hand with establishing convergence for the trees and cycle, we shall also prove consistency.
     The set VA of Theorem 1.1 is composed of all variables from H and those variables from the non-core
factor graph that get assigned. The set VU is composed of the UNASSIGNED variables from non-core
factor graph. We now proceed with the formal proof.

4.1    Analysis of WP on the core factor graph
We start by proving that the messages concerning the factor graph induced by the core-variables converge
to the correct value, and remain the same until the end of the execution.
    We say that a message C → x, C = (`x ∨ `y ∨ `z ), is correct if its value is the same as it is when y → C
and z → C are 1 (that is agree in sign with their planted assignment). In other words, C → x is 1 iff
C = (x ∨ ȳ ∨ z̄) (x supports C).
Proposition 4.1. If xi ∈ H and all messages C → xi , C ∈ F[H] are correct at the beginning of an iteration
(line 3 in the WP algorithm), then this invariant is kept by the end of that iteration.

                      T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                              630
                  C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

Proof. By contradiction, let C0 → x be the first wrongly evaluated message in the iteration, C0 =
(`x ∨ `y ∨ `z ). Then at least one of y, z sent a wrong message to C0 .

                            y → C0 =              ∑             C → y−                ∑            C0 → y .
                                           C∈N + (y),C6=C0                   C0 ∈N − (y),C0 6=C0

Every message C00 → y, C00 ∈ F[H] ∩ {N ++ (y) ∪ N − (y)} is 0 (since it was correct at the beginning of
the iteration and that didn’t change until evaluating C0 → x). On the other hand, y ∈ H and therefore it
supports at least d/4 clauses in F[H]. Thus at least (d/4 − 1) messages in the left hand sum are “1” (we
subtract 1 as y might support C0 ). y appears in at most d/30 clauses with non-core variables (all of which
may contribute a wrong “1” message to the right hand sum). All in all, y → C0 ≥ (d/4 − d/30 − 1) > d/5,
which is correct (recall, we assume ϕ = 1n ). The same applies for z, contradicting our assumption.

Proposition 4.2. If xi ∈ H and all messages C → xi , C ∈ F[H] are correct by the end of a WP iteration,
then Bi agrees in sign with ϕ(xi ) by the end of that iteration.
Proposition 4.2 follows immediately from the definition of H and the message Bi . It suffices to show then
that after the first iteration all messages C → xi , C ∈ F[H] are correct.
Proposition 4.3. If F is a typical instance in the setting of Theorem 1.1, then after one iteration of WP(F),
for every variable xi ∈ H, every message C → xi , C ∈ F[H] is correct.
Proof. The proof is by induction on the order of the execution in the first iteration. Consider the first
message C → x, C = (`x ∨ `y ∨ `z ), C ∈ F[H], to be evaluated in the first iteration. Now consider the
message y → C at the time C → x is evaluated. All messages C0 → y, C0 ∈ F[H] have their initial random
value (as C → x is the first core message to be evaluated). Furthermore, y ∈ H, and therefore there are at
most d/30 messages of the form C00 → y, C00 ∈ / F[H]. x ∈ H hence it is stable w. r. t. π and not violated by
the initial clause-variable random messages. Therefore

         y→C       ≥              d/7                   −              d/30                  −        d/30           >   d/14 .
                                  |{z}                                 | {z }                         | {z }
                       property (3) in Definition 3.7       property (2) in Definition 3.7       non-core messages

The same applies to z, to show that C → x is correct. Now consider a message C → x at position i, and
assume all core messages up to this point were evaluated correctly. Observe that every core message
C0 → y that was evaluated already, if C0 ∈ {N ++ (y) ∪ N − (y)} ∩ F[H] then its value is “0” by the induction
hypothesis. Since x is not violated by α, property (2) in Definition 3.7 ensures that to begin with
|#1α (N ++ (y)) − #1α (N − (y))| ≤ d/30. y ∈ H, therefore it appears in at most d/30 non-core messages,
all of which could have been already wrongly evaluated, changing the above difference by additional
d/30. As for the core messages of y which were already evaluated, since they were evaluated correctly,
property (1) in Definition 3.7 ensures that the above difference changes by at most additional d/30. All
in all, by the time we evaluate C → x,

                                ∑              C0 → y −              ∑              C00 → y ≥ −3 · d/30 .
                         C0 ∈N ++ (y),C0 6=C                 C00 ∈N − (y),C00 6=C

As for messages that y supports, property (3) in Definition 3.7 ensures that their contribution is at least
d/7 to begin with. Every core message in N s (y) that was evaluated turned to “1,” every non-core message

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                                                      631
                        U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

was already counted in the above difference. Therefore y → C ≥ d/7 − 3 · d/30 > d/25. The same applies
to z showing that C → x is correct.

   To prove Proposition 1.3, observe that when p = c log n/n2 , with c a sufficiently large constant,
Proposition 3.10 implies H = V . Combining this with Proposition 4.3, Proposition 1.3 readily follows.

4.2   The effect of messages that already converged
It now remains to analyze the behavior of WP on the non-core factor graph, given that the messages
involving the core factor graph have converged correctly. A key observation is that once the messages
in the factor graph induced by the core variables converged, we can think of WP as if running on the
formula resulting from replacing every core variable with its planted assignment and simplifying (which
may result in a 1-2-3CNF). The observation is made formal by the following proposition:
Proposition 4.4. Consider a run of WP that has converged on the core. Starting at some iteration
after WP has converged on the core, consider two alternative continuations of the warning propagation
algorithm. WP1 denotes continuing with WP on the original input formula. WP2 denotes continuing
with WP on the formula obtained by replacing each core variable with its planted assignment and
simplifying. Then for every iteration t, the sequence of messages in the t’th iteration of WP2 is identical
to the respective subsequence in WP1 . (This subsequence includes those messages not involving the core
variables, and includes messages of type x → C and of the type C → x.)
Proof. First note that all messages x → C, x ∈ H, do not change (sign) from the second iteration onwards
(by the analysis in the proof of Proposition 4.3). Furthermore, if `x satisfies C in ϕ, then x → C is positive
(if x is a true literal in C, or negative otherwise), and therefore all messages C → y, y 6= x are constantly
0. Namely, they don’t affect any calculation, and this is as if we replaced `x with TRUE, and in the
simplification process C disappeared. If `x is false in C under ϕ, then x → C is constantly negative (if
`x = x, or constantly positive if `x = x̄), and this is exactly like having `x removed from C (which is the
result of the simplification process).


4.3   Analysis of WP on the non-core factor graph
Note that to prove the convergence of the algorithm we need also to prove that messages of the sort C → x
where C is not in the core and x is in the core converge. However, if we prove that all messages in the
factor graph induced by the non-core variables converge, then this (with the fact that the core factor graph
messages converge) immediately implies the convergence of messages of this type. Therefore, our goal
reduces to proving convergence of WP on the factor graph induced by F|ψ , where ψ assigns the core
variables their planted assignment, and the rest are UNASSIGNED.
    We say that WP converged correctly in a connected component C of the non-core factor graph if there
exists a satisfying assignment ψ of the entire formula which is consistent with ϕ on the core, and with
the assignment of WP to C.
   Consider a connected component in the non-core factor graph consisting of a cycle with trees hanging
from it. Our analysis proceeds in three steps:

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                              632
                  C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

   1. We first prove that clause-variable and variable-clause messages of the form α → β where α → β
      lead from the trees to the cycle, converge correctly w. r. t. the planted assignment. In the case that
      the component has no cycles, this concludes the proof.

   2. Then, using a refined case analysis, we show that the messages along the cycle also converge w. h. p,
      this time not necessarily to the planted assignment, but to some satisfying assignment which agrees
      with the already converged messages.

   3. Finally, we conclude by showing that messages from the cycles to the trees converge. Moreover,
      this will imply that convergence will be to values consistent with the values converged to in step (1),
      and that hence that all messages in the connected component converge correctly according to some
      satisfying assignment.

     Consider the factor graph F induced by the simplified formula. A cycle in F is a collection
x1 ,C2 , x3 ,C4 , . . . , xr = x1 where xi and xi+2 belong to Ci+1 for all i (in our description we consider
only odd values of i) and xi 6= xi+2 , Ci+1 6= Ci+3 for all i. A factor graph F is a tree if it contains no cycles.
It is unicyclic if it contains exactly one cycle. Let x → C be a directed edge of F. We say that x → C
belongs to the cycle, if both x and C belong to the cycle. For an edge x → C that does not belong to the
cycle, we say that x → C is directed towards the cycle if x does not belong to the cycle and C lies on the
simple path from x to the cycle. We say that the edge x → C is directed away from the cycle if C does not
belong to the cycle and x lies on the simple path from the cycle to C. Similarly we define what it means
for an edges C → x to belong to the cycle, to be directed towards the cycle and to be directed away from
the cycle.

Proposition 4.5. Let F be a unicyclic factor graph. Then every directed edge of the form x → C or C → x
either belongs to the cycle, or is directed towards it or directed away from it.

Proof. Recall that the factor graph is an undirected graph, and the direction is associated with the
messages. Take an edge x → C (similarly for C → x), if it lies on the cycle, then we are done. Otherwise,
since the factor graph is connected, consider the path in the tree leading from some element of the cycle
to C. This path is either contained in the path to x or contains it (otherwise there is another cycle). In the
first case x → C is directed towards the cycle, and in the latter x → C is directed away from the cycle.

   Our analysis proceeds in two parts: first we shall analyze WP on the trees, then WP on the cycle and
connect the two (which is relevant for the uni-cyclic components).

4.4 WP on the trees
As we already mentioned before, there are two directions to consider: messages directed towards the
cycle and away from the cycle. In this section we shall consider a rooted tree, and partition the messages
according to messages which are oriented away from the root (they will correspond in the sequel to
messages going away from the cycle) and messages towards the root (messages from the leaves towards
the root—later to be identified with messages going into the cycle). The first lemma concerns messages
going towards the root.

                      T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                                 633
                        U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

Remark 4.6. Lemma 4.7 is a special case of the known fact (see [7] for example) that for every tree
induced by a satisfiable formula, WP converges and there exists a satisfying assignment ψ such that
every Bi is either 0 or agrees with ψ. In Lemma 4.7 we assume that the formula is satisfiable by the all 1
assignment (the planted assignment), and consider only messages towards the root.
Lemma 4.7. Let C → x be an edge in the non-core factor graph belonging to a connected component of
size s, and in particular to a rooted tree T . If C → x is directed towards the root then the message C → x
converges after at most O(s) iterations. Furthermore, if C → x = 1 then x appears positively in C.
Proof. We consider the case C = (`x ∨ `y )—the case C = (`x ∨ `y ∨ `z ) where all three literals belong to
non-core variables is proved similarly. For an edge (C, x) in the factor graph, we define level(C, x) to
be the number of edges in a path between C and the leaf most distant from C in the factor graph from
which the edge (C, x) is removed. The lemma is now proved using induction on the level i. Namely, after
the i0th iteration, all messages C → x associated with an edge (C, x) at level i converge, and if C → x = 1
then x appears positively in C.
    The base case is an edge (C, x) at level 0. If level(C, x) = 0 then C is a unit clause containing only
the variable x. By the definition of the messages, in this case C → x = 1 and indeed it must be the case
that x is positive in C (as the other two variables evaluate to FALSE under the planted). Now consider
an edge (C, x) at level i, and consider iteration i. Since i > 0, it must be that there is another non-core
variable y in C (or two more variables y, z). Consider an edge (C0 , y), y ∈ C0 (if no such C0 exists that we
are done as C → x will be constantly 0 in this case).
    level(C0 , y) is strictly smaller than i since every path from C to a leaf (when deleting the edge (C, x))
passes through some edge (C0 , y). By the induction hypothesis, all messages C0 → y already converged,
and therefore also y → C and in turn C → x. It is only left to take care of the case C → x = 1. In this
case, there must be a clause C0 such that C0 → y = 1 and y appears positively in C0 (by the induction
hypothesis). If C → x = 1 it must be that y appears negatively in C and therefore x must appear positively
(otherwise C is not satisfied by the planted assignment).

    Next we consider several scenarios that correspond to messages going form the root towards the
leaves. Those scenarios correspond to step (3) of our analysis, referred to in Section 4.3.
Claim 4.8. Assume that F is a unicyclic formula. Assume further that WP has converged on F. Let
x → C be directed away from the cycle. Let FC be the subformula inducing the tree rooted at C, while
x itself is removed from C. This formula contains all clauses whose path to the cycle goes via x and a
clause corresponding to C where the literal corresponding to the variable x is removed. Then C → x = 0
in the fixed point if and only if FC is satisfiable.
Proof. The structure of the proof is similar to that of Lemma 4.7. For convenience we extend the
definition of level above as to include edges on the cycle. We say that an edge (C, x) in the factor graph
has level(C, x) equal ∞ if (C, x) lies on a cycle and level(C, x) = t < ∞ if t is the maximal length of a
path between C and a leaf in the factor graph from which the edge (C, x) is removed. The lemma is now
proved using induction on t.
    The base case is an edge (C, x) at level 0. If level(C, x) = 0 then C is a unit clause containing only
the variable x, and then FC is the empty formula. Indeed C → x = 1 by definition and FC is unsatisfiable
(by definition again, the empty formula is not satisfiable).

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                              634
                  C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

     Now consider an edge (C, x) at level t > 0. Assume C = (x ∨ `y ∨ `z ) (maybe only `y ). Observe that
every edge (C j , y) in FC has level value which is strictly smaller than t—since every path from C to a
leaf (when deleting the edge (C, x)) passes through some edge (C j , y). First we prove that if FC is not
satisfiable then it must be that C → x = 1. If FC is not satisfiable then it must be that `y = ȳ and `z = z̄
(otherwise, if one of them is positive then ϕ satisfies FC ). Further, observe that there must exist at least
one clause C j containing y positively, and at least one Ck containing z positively such that FCi and FC j
are unsatisfiable. Otherwise, we can define ϕ 0 to be ϕ except that y or z are assigned FALSE (depending
which of Ci or C j does not exist). It is easy to see that ϕ 0 satisfies FC , contradicting our assumption. By
the induction hypothesis Ci → y = 1 and C j → z = 1. This in turn implies that C → xi = 1 (since Ci and
C j contain y and z respectively in an opposite polarity to C and there cannot be any message Ck → y = 1
where y appears negatively in Ck since FCk is satisfiable in this case). Now assume that C → x = 1. The
same arguments imply that C must be of the form C = (x ∨ ȳ ∨ z̄) and there exists Ci ,C j as above. By the
induction hypothesis, if one is to satisfy FC it must be that y = z = TRUE (this is the only way to satisfy
FCi and FC j when inserting y back to Ci and z to C j ), but then C is not satisfied.

    Recall our definition for WP converges correctly on a subformula F 0 of F if there exists a satisfying
assignment ψ to F which is consistent with the planted assignment ϕ on the core, and with the assignment
of WP to F 0 .
Claim 4.9. Assume that F is a unicyclic formula. Assume further that WP has converged on F. Let
C → y be directed away from the cycle. Consider a subformula FC which induces a tree rooted at a clause
C. This formula contains the clause C and all other clauses whose path to the cycle goes via y. If in the
fixed point for F it holds that
    • C → y = 1,
    • y appears negatively in C,
    • y → C ≥ 1,
then WP converge correctly on FC .
Proof. Let us track the origin of the message y → C in the tree (which is directed towards the root).
For y → C ≥ 1 to occur, there must be a clause D1 in the tree that contains y positively and a message
D1 → y = 1 in the direction of the root (as messages in the direction of the root are only affected by
other messages in that direction). Let us backtrack one more step. D1 = (y ∨ k̄ ∨ w̄) for some variables
w, k; k and w must appear negatively in D1 by Lemma 4.7), and the fact that D1 → y = 1, that is both
k and w were issued warnings having them not satisfy D1 . Let us consider the clauses D2 and D3 that
issues warnings to k and w respectively. D2 = (k ∨ . . .), D3 = (w ∨ . . .), D2 → k = 1 and D3 → w = 1, and
both messages are directed towards the root. Obviously, one can inductively continue this backtracking
procedure which terminates at the leaves of the tree (since there are not cycles the procedure is well
defined and always terminates). Let us call the clauses and variables that emerge in this backtrack the
spine of the tree (see Figure 1). The figure below illustrates this procedure.
    For the subtree corresponding to the spinal variable we show that all messages that point away from
the root converge in a consistent way with planted assignment. This combined with Lemma 4.7 completes
that part of the proof.

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                              635
                         U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK




                                       Figure 1: The spine of a tree

    Let k be some variable that belongs to the spine, and let Fk be the subformula corresponding to the
tree hanging from k (and we think of the messages along that tree oriented away from the cycle). We
claim that k → C ≥ 0 for every C in Fk . This is proven via induction on the distance of the variable from
y. The base case is distance 0, which is y itself. The messages that we need to verify are of the form
y → C0 , C0 6= D1 , which are pointing away from the cycle. In this case y → C0 ≥ 0 (y’s message agrees
with the planted); this is because the wrong message Ci+1 → y is evened by the correct warning D1 → y,
and y → C0 depends only on one message which is directed away from the cycle—otherwise there is a
second cycle. The induction step follows very similarly. This fact guarantees correct convergence of the
variables in the trees hanging from spinal variables. Finally, for the spinal variables, there is always at
most one message in the direction away from the cycle (otherwise there is more than one cycle); this
message may be wrong. Using a very similar inductive argument one can show that there is always at
least one correct warning (in the direction of the cycle), therefore Bw ≥ 0 for every spinal variable w.
    As for the non-spinal parts of the tree that hang on y, say a clause M 6= D1 ,C, then y → M ≥ 0 since
the wrong message of C is evened by the correct warning of D1 (and there may be other correct warnings
from messages in the direction of the cycle). Since there is a satisfying assignment of that subtree which
is consistent with By ≥ 0, Remark 4.6 can be applied to guarantee correct convergence.

4.5 WP on cycles
We will denote a cycle by x1 ,C2 , x3 ,C4 . . . , x2r−1 ,C2r , x1 where by this we mean that xi appears in the
clauses before/after it and that Ci contains the two variables before/after it. We consider two different
types of cycles.
    • Biased cycles: cycles that have at least one warning message C → xi = 1 coming into the cycle,

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                              636
                   C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

        where C → xi directs into the cycle and the value of C → xi is the value after the edge has converged.

    • Free cycles: cycles that do not have such messages coming in, or all messages coming in are 0
      messages.


4.5.1    Convergence of WP when the cycle is biased

First we observe that we may assume that edges that enter the cycle enter it at a variable rather than at a
clause (hence that every clause on the cycle contains exactly two non-core variables). This is because of a
simple argument similar to Proposition 4.4: consider an edge going into the cycle, z → C, and assume that
z appears positively in C. After all the edges going into the cycle have converged, if z → C ≥ 0 it follows
that C → x = 0 for cycle edges (C, x), and thus execution on the cycle is the same as if C was removed
from the formula, only now we are left with a tree, for which convergence to a correct assignment is
guaranteed (Remark 4.6). If z → C < 0, then the execution is exactly as if z was removed from C (and C
is in 2-CNF form).

Proposition 4.10. Let C be a connected component of the factor graph of size s containing one cycle
such that there exists an edge directed into the cycle C → xi where xi belongs to the cycle and such that
the message converges to C → xi = 1. Then WP converges on C after at most O(s) rounds. Moreover for
the fixed point, if the message C0 → x = 1 then x appears positively in C0 .

Proof. A message of the cycle C j → x j+1 depends only on cycle messages of the type C j0 → x j0 +1 ,
x j0 +1 → C j0 +2 and on messages coming into the cycle. In other words during the execution of WP the
values of all messages C j0 → x j0 −1 , x j0 −1 → C j0 −2 do not affect the value of the message C j → x j+1 .
Recall that we are in the case where there exists a message C → xi = 1 going into the cycle (after the
convergence of these messages). Also xi must appear positively in C. We consider the following cases:

    • There exists a variable x j that appears positively in both C j−1 and C j+1 (the case j = i is allowed
      here). We note that in this case the message x j → C j+1 must take the value either 0 or 1 which
      implies that the message C j+1 → x j+2 converges to the value 0. This in turn implies that the value
      of all messages xr → Cr+1 and Cr+1 → xr+2 for r 6= j will remain the same if the clause C j+1 is
      removed from the formula. However, this case reduces to the case of tree formula.

    • xi appears negatively in Ci+1 and positively in Ci−1 . We note that in this case the message xi → Ci+1
      always take the value 1 which implies that the message Ci+1 → xi+2 always take the value 1. Thus
      in this case we may remove the clause Ci+1 from the formula and replace it by the unit clause `y
      where Ci+1 = `y ∨ x̄i . Again, this reduces to the case of a tree formula.

    • The remaining case is the case where xi appears negatively in both Ci−1 and Ci+1 and there is no j
      such that x j appears positively in both C j−1 and C j+1 . We claim that this leads to contradiction. Note
      that by the lemma above there exists a satisfying assignment where xi = 1. Write Ci+1 = x̄i ∨ `i+2 .
      Then for the truth assignment we must have `i+2 = 1, similarly `i+4 = 1 etc. until we reach x̄i = 1—a
      contradiction.

                      T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                              637
                        U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

To summarize, by Lemma 4.7 the messages going into the rooted tree at xi converge after O(s) steps, and
at least one warning is issued. By the above discussion, for every clause D in the connected component it
holds that xi → D ≥ 0 (as xi appears in at most one message which may be wrong—a cycle message).
Since there is always a satisfying assignment consistent with xi assigned TRUE, then after reducing the
cycle to a tree we are left with a satisfiable tree. Remark 4.6 guarantees convergence in additional O(s)
iterations.

4.5.2   Convergence of WP when the cycle is free
The main result of this subsection is summarized in the following claim:

Claim 4.11. Let C be a connected component of the factor graph of size s containing one cycle of size r
such that the fixed point contains no messages C → x = 1 going into the cycle (the cycle is free). Then
w. h. p. WP converges on C after at most O(r2 · log n + s) rounds. Moreover for the fixed point, if we
simplify the formula which induces C according to the resulting Bi ’s, then the resulting subformula is
satisfiable.

Remark 4.12. Observe that the free case is the only one where convergence according to the planted
assignment is not guaranteed. Furthermore, the free cycle case is the one that may not converge “quickly”
(or not at all), though this is extremely unlikely. The proof of Proposition 4.11 is the only place in the
analysis where we use the fact that in line 3.a of WP we use fresh randomness in every iteration.

    We consider two cases: the easy one is the case in which the cycle contains a pure variable w. r. t the
cycle (though this variable may not be pure w. r. t to the entire formula).

Claim 4.13. If the cycle contains a variable xi appearing in the same polarity in both Ci+1 ,Ci−1 , then the
messages C → x along cycle edges converge. Moreover for the fixed point, if C → x = 1 then x satisfies C
according to ϕ.

    The proof is essentially the first case in the proof of Proposition 4.10. We omit the details.
We now move to the harder case, in which the cycle contains no pure variables (which is the case referred
to in Remark 4.12).

Proposition 4.14. Consider a free cycle of size r with no pure literal, and one of the two directed cycles
of messages. Then the messages along the cycle converge w. h. p. to either all 0 or all 1 in O(r2 log n)
rounds.

    Convergence w. h. p. in polynomial time suffices due to Corollary 3.12 (which asserts that w. h. p.
every cycle is of length at most logε n for every ε > 0). The proof of Proposition 4.14 is given in the end
of this section. We proceed by analyzing WP assuming that Proposition 4.14 holds, which is the case
w. h. p.

Proposition 4.15. Suppose that the cycle messages have converged (in the setting of Proposition 4.14),
then the formula resulting from substituting every xi with the value assigned to it by Bi (according to the
fixed point of WP), and simplifying, is satisfiable.

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                            638
                  C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

Proof. Let F be the subformula that induces the connected component C, and decompose it according to
the trees that hang on the cycle’s variables and the trees that hang on the cycle’s clauses. Observe that the
formulas that induce these trees are variable and clause disjoint (since there is only one cycle in the C).
     Let us start with the cycle clauses. The key observation is that setting the cycle variables according
to one arbitrary orientation (say, set xi to satisfy Ci+1 ) satisfies the cycle and does not conflict with any
satisfying assignment of the hanging trees: if the tree hangs on a variable xi , then since the cycle is free,
the tree is satisfiable regardless of the assignment of xi (Proposition 4.8). In the case that the tree hangs
on a cycle-clause C, then the cycle variables and the tree variables are disjoint, and C is satisfied already
by a cycle-variable regardless of the assignment of the tree-variables. Now how does this coincide with
the result of WP. Recall that we are in the case where the cycle is free. Therefore only messages C → xi
where both C and xi belong to the cycle affect Bi . If in the fixed point one cycle orientation is 0 and
one orientation is 1, then the Bi messages of the cycle variables implement exactly this policy. If both
cycle orientations converged to 1 or to 0, then the corresponding Bi messages of all cycle variables are
UNASSIGNED (since the cycle is free), but then the same policy can be used to satisfy the clauses of the
cycle in a manner consistent with the rest of the formula.
     It remains to show that WP converges on every tree in a manner that is consistent with some satisfying
assignment of the tree. We consider several cases.
     Consider a tree hanging on a cycle variable xi . Let C be some non-cycle clause that contains xi , and
FC the subformula that induces the tree rooted at C. Observe that once the cycle has converged, then the
message xi → C does not change anymore. If xi → C agrees with ϕ there are two possibilities. Either xi
satisfies C under ϕ, in which case C always sends 0 to FC , and then WP executes on FC as if C is removed.
Remark 4.6 guarantees correct convergence (as FC \C is satisfiable), and as for C, Bi ≥ 0 and we can set
xi to TRUE so that it satisfies C and is consistent with the assignment of the cycle (Bi ≥ 0 since xi ≥ 0
and C → xi = 0 as we are in the free cycle case). If xi appears negatively in C, then WP executes as if xi
was deleted from C. Still FC is satisfiable and correct convergence is guaranteed.
     Now consider the case where xi → C disagrees with ϕ. Recall that we assume ϕ(xi ) = TRUE, and
therefore x → C is negative in the fixed point. If xi appears negatively in C then C → y = 0 for every
y ∈ C (since xi signals C that it satisfies it), and therefore C does not affect any calculation from this point
onwards, and the correct convergence of FC is again guaranteed by Remark 4.6 on the convergence for
satisfiable trees. The more intricate case is if C contains xi positively. Since we are in the free case, it
must hold that C → x = 0. Therefore using Proposition 4.8 one obtains that FC is satisfiable (regardless
of the assignment of xi ), and WP will converge as required (again Remark 4.6).
     Now consider a tree hanging on a cycle clause. Namely, Ci+1 = (xi ∨ xi+2 ∨ y), where xi , xi+2 are
cycle variables, and (Ci+1 , y) is a tree edge. If one of the cycle orientations converged to 0, then Ci+1 → y
converges to 0, and then Remark 4.6 guarantees correct convergence. The same applies to the case
where Ci+1 → y converges to 1 and y is positive in Ci+1 (since then we can remove y from the tree, use
Remark 4.6 for the remaining part, then add back y, and set it to TRUE without causing any conflict with
the tree assignment, but satisfying Ci+1 according to the planted assignment).
     The delicate case remains when Ci+1 → y converges to 1 but y’s polarity in Ci+1 disagrees with
ϕ, that is, y is negative in Ci+1 . The key observation is that the message y → Ci+1 (which is directed
towards the cycle) must have converged to a positive value (otherwise, Ci+1 → xi and Ci+1 → xi+2 would
have converged to 0). However this complies with the scenario of Proposition 4.9, and again correct

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                               639
                         U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

convergence is guaranteed.

     In Theorem 1.1 the unassigned variables are required to induce a “simple” formula, which is satisfiable
in linear time. Observe that the factor graph induced by the UNASSIGNED variables consists of connected
components whose structure is a cycle with trees hanging on it, or just a tree. A formula whose factor
graph is a tree can be satisfied in linear time by starting with the leaves (which are determined uniquely
in case that the leaf is a clause—namely, a unit clause, or if the leaf is a variable then it appears only in
one clause, and can be immediately assigned) and proceeding recursively. Regarding the cycle, consider
an arbitrary variable x on the cycle. By assigning x and simplifying accordingly, we remain with a tree.
Since there are only two ways to assign x, the whole procedures is linear in the size of the connected
component. This completes the proof of Theorem 1.1.

4.5.3   Proof of Proposition 4.14
Since the cycle has no pure literal it must be of the following form: C1 = (`x1 ∨ `x2 ),C2 = (`x2 ∨
`x3 ), . . . ,CL = (`xL ∨ `x1 ).
     Consider one of the directed cycles, say: x1 → C1 → x2 → · · · and note that when the message xi → Ci
is updated it obtains the current value of Ci−1 → xi and when the message Ci → xi+1 is updated, it obtains
the current value of xi → Ci .
     It thus suffices to show that the process above converges to all 0 or all 1 in time polynomial in the
cycle length. This we prove in the lemma below.

Claim 4.16. Consider the following process on {0, 1}L . Given the state of the process Γi at round i, the
state Γi+1 at round i + 1 is defined by choosing a permutation σ ∈ Sk uniformly at random. Then let
∆0 = Γi and for 1 ≤ j ≤ L, let ∆ j be obtained from ∆ j−1 by setting ∆ j (σ ( j − 1)) = ∆ j−1 ((σ ( j − 1) + 1)
mod L) and ∆ j (i) = ∆ j−1 (i) for all i 6= σ ( j − 1). Finally, let Γi+1 = ∆L .
    Let T be the stopping time where the process hits the state all 0 or all 1. Then

                                            Pr[T ≥ 4aL2 ] ≤ L2−a .                                       (4.1)

for all a ≥ 1 integer.

    The proof of Proposition 4.16 is based on the fact that the process defined in this lemma is a martingale.
This is established in the following two lemmas.

Lemma 4.17. Consider the following variant Γ˜i of the process Γi defined in Proposition 4.16. In the
variant, different intervals of 0/1 are assigned different colors and the copying procedure is as above. Fix
one color and let Xi denote the number of elements of the cycle of that color in Γ̃i . Then Xi is a martingale
with respect to the filtration defined by Γ̃i .

Proof. From the definition of the copying process it is easy to see that

                                     E[Xi+1 |Γ̃i , Γ̃i−1 , . . .] = E[Xi+1 |Xi ] .

We will show below that E[Xi+1 |Xi ] = 0 and thus that Xi is a martingale with respect to the filtration Γ̃i .

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                               640
                  C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

    Assume that Xi = k. Then we may assume that the configuration Γ̃i consists of an interval of 1’s of
length k and an interval of 0’s of length L − k. We calculate separately the expected shifts in the locations
of left end-points of the 0 and 1 interval respectively. We denote the two shift random variables by L0
and L1 . Clearly L0 = I0,1 + I0,2 + · · · + I0,k−1 where I0, j is the indicator of the event that 0 left-end point
shifted by at least j and similarly for L1 . Note that
                                                        1       1
                                          E[I0, j ] =     −
                                                        j! (L − k + j)!
and that
                                                      1       1
                                           E[I1, j ] =  −          .
                                                      j! (k + j)!
The last equation follows from the fact that in order for the 1 interval to extend by at least j, the j copying
has to take place in the correct order and it is forbidden that they all took place in the right order and the
interval has become a 0 interval. The previous equation is derived similarly. Thus
                                                L−k                    k                      
                                                      1      1                1          1
        E[(Xi+1 − Xi )|Xi ] = E[L1 ] − E[L0 ] = ∑       −            −∑         −                   = 0.
                                                j=1   j! (k + j)!       j=1   j! (L − k + j)!

This concludes the proof that Xi is a martingale. The proof follows.

     The proof of Proposition 4.16 follows by a union bound from the following lemma where the union
is taken over all intervals.
Lemma 4.18. Consider the process Γ˜i defined in Lemma 4.17. Fix one interval and let Xi denote its
length. Let T be the stopping time where Xi equals either 0 or L. Then

                                             Pr[T ≥ 4aL2 ] ≤ 2−a .

Proof. In order to bound the hitting probability of 0 and L, we need some bounds on the variance of the
martingale differences. In particular, we claim that unless k = 0 or k = L it holds that

                                        E[(Xt+1 − Xt )2 |Xt = k] ≥ 1/2 .

If k = 1 or k = L − 1 this follows since with probability at least 1/2 the end of the interval will be hit.
Otherwise, it is easy to see that the probability that Xt+1 − Xt is at least 1 is at least 1/4 and similarly the
probability that it is at most −1 is at least 1/4. This can be verified by considering the event that one
end-points moves by at least 2 and the other one by at most 1.
     Let T be the stopping time when XT hits 0 or n. Then by a Wald kind of calculation we obtain:
                                                ∞                    2 
                       L2 ≥ E[(XT − X0 )2 ] = E ∑ 1(T ≥ t)(Xt − Xt−1 )
                                                        t=1
                              ∞                                       
                          = E ∑ (Xt − Xt−1 )(Xs − Xs−1 )1(T ≥ max{t, s}
                               t,s=1
                               ∞                      1 ∞
                          = E ∑ (Xt − Xt−1 )2 1(T ≥ t) ≥ ∑ P[T ≥ t] = E[T ]/2 ,
                              t=1                       2 t=1


                      T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                                 641
                         U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

where the first equality in the last line follows from the fact that if s < t say then:

                      E[(Xt − Xt−1 )(Xs − Xs−1 )1(T ≥ maxt, s)] =
                       = E[(Xt − Xt−1 )(Xs − Xs−1 )(1 − 1(T < t))] =
                       = E[E[(Xt − Xt−1 )(Xs − Xs−1 )(1 − 1(T < t))|X1 , . . . , Xt−1 ]] =
                       = E[(Xs − Xs−1 )(1 − 1(T < t))E[Xt − Xt−1 |X1 , . . . , Xt−1 ]] = 0 .

   We thus obtain that E[T ] ≤ 2L2 . By Markov’s inequality, this implies in turn that Pr[T ≥ 4L2 ] ≤ 1/2.
Since Xt is a Markov chain it now follows that for all a ∈ N,

                                    Pr[T ≥ 4aL2 |T ≥ 4(a − 1)L2 ] ≤ 1/2 ,

so
                              a                         a
                                   Pr[T ≥ 4bL2 ]
           Pr[T ≥ 4aL2 ] = ∏                      2]
                                                     = ∏   Pr[T ≥ 4bL2 |T ≥ 4(b − 1)L2 ] ≤ 2−a .
                            b=1 Pr[T ≥  4(b − 1)L      b=1

The proposition follows.



5    Proof of Proposition 3.11
We shall start with the proof of item (2). Then we shall remark how to adjust this proof to prove (3).
Item (1) was proven, though with respect to a slightly different definition of a core, in [18]. The proof of
item (1) is identical to that in [18], with the adjustments made along the proof of item (2). Details of the
proof of (1) are omitted.
    In order to prove Proposition 3.11 (2) it suffices to prove that w. h. p. there are no two cycles with a
simple path (maybe of length 0) connecting the two. To this end, we consider all possible constellations
of such prohibited subgraphs and prove the proposition using a union bound over all of them.
    Every simple 2k-cycle in the factor graph consists of k variables, say x1 , . . . , xk (all different), and k
clauses C1 , . . . ,Ck , such that xi , xi+1 ∈ Ci . The cycle itself consists of 2k edges.
    As for paths, we have 3 different types of paths: paths connecting a clause in one cycle with a variable
in the other (type 1), paths connecting two clauses (type 2), and paths connecting two variables (type 3).
Clause-variable paths are always of odd length, and clause-clause, variable-variable paths are always of
even length. A k-path P consists of k edges. If it is a clause-variable path, it consists of (k − 1)/2 clauses
and the same number of variables. If it is a variable-variable path, it consists of k/2 − 1 variables and k/2
clauses and symmetrically for the clause-clause path (we don’t take into account the clauses/variables
that participate in the cycle, only the ones belonging exclusively to the path).
    Our prohibited graphs consist of two cycles C1 ,C2 and a simple path P connecting them. We call a
graph containing exactly two simple cycles and a simple path connecting them a bi-cycle. The path P can
be of either one of the three types described above. Similarly to the bi-cycle case, one can have a cycle C
and a chord P in it. We call such a cycle a chord-cycle. For parameters i, j, k ∈ [1, n], and t ∈ {1, 2, 3},
we denote by B2i,2 j,k,t a bi-cycle consisting of a 2i-cycle connected by a k-path of type t to a 2 j-cycle.
Similarly, we denote by B2i,k,t a chord-cycle consisting of a 2i-cycle with a k-path of type t as a chord.

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                                642
                  C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

     Our goal is then to prove that w. h. p. the graph induced by the non-core variables contains no bi-cycles
and no chord-cycles.
     For a fixed factor graph H we let FH ⊆ F be a fixed minimal set of clauses inducing H, and V (H)
be the set of variables in H. In order for a fixed graph H to belong to the factor graph induced by
the non-core variables it must be that there exists some FH such that FH ⊆ F and that V (H) ⊆ H̄ (put
differently, V (H) ∩ H = 0). /
     Let B = B2i,2 j,k,t (or B = B2i,k,t if B is a chord-cycle) be a fixed bi-cycle and FB a fixed minimal-set of
clauses inducing B. We start by bounding Pr[FB ⊆ F and V (B) ∩ H = 0]        / and then use the union bound over
all possible bi-cycles (chord-cycles) and inducing minimal sets of clauses. As the two events—{FB ⊆ F}
and {V (B) ∩ H = 0}—are
                      /        not independent, the calculations are more involved. Loosely speaking, to
circumvent the dependency issue, one needs to defuse the effect that the event {FB ⊆ F} might have on
H. To this end we introduce a set H∗ , defined very similarly to H only “cushioned” in some sense to
overcome the dependency issues (the “cushioning” depends on FB ). This is done using similar techniques
to [4, 18].
     We start by defining the new set of core variables H∗ (again w. r. t. an ordering π of the clause-variable
messages and an initial values vector α). The changes compared to H are highlighted in bold.

     Let B1 be the set of variables whose support w. r. t. ϕ is at most d/3.
     Let B2 be the set of non-stable variables w. r. t. π where we redefine the gap in Definition 3.5 to
         be (d/30 − 6) in (1) and (2).
     Let B3 be the set of stable variables w. r. t. π which are violated by α where we redefine the gap
         in Definition 3.7 to be (d/30-6) in (1) and (2).
     Let J ⊆ V (B) be the set of variables appearing in no more than 6 different clauses in FB

         1. Set H00 = V \ (B1 ∪ B2 ∪ B3 ∪ (V (FC ) \ J)).

         2. While there exists a variable ai ∈ Hi0 which supports less than d/4 clauses in F[Hi0 ] OR
            appears in more than (d/30-6) clauses not in F[Hi0 ], define Hi+1
                                                                          0   = Hi0 \ {ai }.

         3. Let am be the last variable removed at step 2. Define H∗ = Hm+1
                                                                        0        0
                                                                            . = Hm+1 .


Propositions 3.6 and 3.8 could be easily adjusted to accommodate the 6-gap in the new definition in B2
and B3 . Therefore Proposition 3.10 can be safely restated in the context of H∗ :
Proposition 5.1. If both α and π are chosen uniformly at random then w. h. p. #H∗ ≥ (1 − e−Θ(d) )n.
Proposition 5.2. Let b = #V (B), then the set J defined above satisfies #J ≥ b/4.
Proof. Observe that if FB is minimal then #FB ≤ b + 1. This is because in every cycle the number of
variables equals the number of clauses, and in the worst case, the path contains at most one more clause
than the number of variables, and the same goes for the chord-cycle. Now suppose in contradiction that
#J < b/4, then there are more than 3b/4 variables in V (B), each appearing in at least 6 different clauses
in FB . Thus, #FB > (6 · 3b/4)/3 = 1.5b > b + 1 (we divided by three as every clause might have been
counted 3 times. The last inequality is true since b ≥ 3), contradicting #FB ≤ b + 1.

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                                 643
                         U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

   The following proposition “defuses” the dependency between the event that a bi-cycle (chord-cycle)
was included in the graph and the fact that it does not intersect the core variables. In the following
proposition we fix an arbitrary π and α in the definition of H∗ , therefore the probability is taken only
over the randomness in the choice of F.

                                           / ≤ Pr[FB ⊆ F] · Pr[J ∩ H∗ = 0].
Proposition 5.3. Pr[FB ⊆ F and V (B) ∩ H = 0]                           /

Before proving Proposition 5.3, we establish the following fact.

Lemma 5.4. For every bi-cycle (chord-cycle) B and every minimal inducing set FB , H∗ (F, ϕ, α, π) ⊆
H(F ∪ FB , ϕ, α, π).

Proof. The lemma is proved using induction on i (i being the iteration counter in the construction of H).
For the base case H00 (F) ⊆ H0 (F ∪ FB ), since every variable in H00 (F) appears in at most 6 clauses in FB
it holds that Ai (F ∪ FB ) ⊆ Bi (F), i = 2, 3. A1 (F ∪ FB ) ⊆ B1 (F) holds at any rate as more clauses can only
increase the support, and the set J was not even considered for H0 . Suppose now that Hi0 (F) ⊆ Hi (F ∪ FC ),
and prove the lemma holds for iteration i + 1. If x ∈ Hi+1  0 (F) then x supports at least d/3 clauses in which
                       0               0
all variables are in Hi (F). Since Hi (F) ⊆ Hi (F ∪ FB ), then x supports at least this number of clauses with
only variables of Hi (F ∪ FC ). Also, x appears in at most d/30 − 6 clauses with some variable outside
of Hi0 (F), again since Hi0 (F) ⊆ Hi (F ∪ FB ) and FB contains at most 6 clauses containing x, x will appear
in no more than d/30 clauses each containing some variable not in Hi (F ∪ FB ). We conclude then that
x ∈ Hi (F ∪ FB ).

    This lemma clarifies the motivation for defining H∗ . It is not necessarily true that H(F) ⊆ H(F ∪ FB ).
For example, a variable which appears in H(F) could disappear from H(F ∪ FB ) since the clauses in FB
make it unstable. Loosely speaking, H∗ is cushioned enough to prevent such a thing from happening.


Proof. (Proof of Proposition 5.3)

    Pr[FB ⊆ F and V (B) ∩ H = 0]
                              / ≤ Pr[FB ⊆ F and J ∩ H = 0]
                                                        / = Pr[J ∩ H = 0/ | FB ⊆ F] Pr[FB ⊆ F] .

Therefore, it suffices to prove

                                     Pr[J ∩ H = 0/ | FB ⊆ F] ≤ Pr[J ∩ H∗ = 0]
                                                                           / .

                 Pr[J ∩ H∗ = 0]
                             / =              ∑          Pr[F = F]          ≥              ∑           Pr[F = F]
                                        F:J∩H∗ (F)=0/
                                                                           |{z}
                                                                                     F:J∩H(F∪FB )=0/
                                                                         Lemma 5.4

Break each set of clauses F into F 0 = F \ FB and F 00 = F ∩ FB , and the latter equals

                                   ∑                      ∑          Pr[F \ FB = F 0 and F ∩ FB = F 00 ]
                     F 0 :F 0 ∩FB =0,J∩H(F
                                   /              / F 00 :F 00 ⊆FB
                                           0 ∪F )=0
                                               B



                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                                       644
                  C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

Since the two sets of clauses, F \ FB , and F ∩ FB , are disjoint, and clauses are chosen independently, the
last expression equals,

                                   ∑                      ∑          Pr[F \ FB = F 0 ] Pr[F ∩ FB = F 00 ] =
                     F 0 :F 0 ∩FB =0,J∩H(F
                                   /              / F 00 :F 00 ⊆FB
                                           0 ∪F )=0
                                               B

                                   ∑                  Pr[F \ FB = F 0 ]          ∑            Pr[F ∩ FB = F 00 ] =
                     F 0 :F 0 ∩FB =0,J∩H(F
                                   /       0 ∪F )=0
                                               B  /                          F 00 :F 00 ⊆FB
                                                                             |                  {z            }
                                                                                                 1

                                   ∑                  Pr[F \ FB = F 0 ] .
                     F 0 :F 0 ∩FB =0,J∩H(F
                                   /       0 ∪F )=0
                                               B  /


Since (F \ FB ) ∩ FB = 0,
                       / and clauses are chosen independently, the event {FB ⊆ F} is independent of the
                   0
event {F \ FB = F }. Therefore, the latter expression can be rewritten as

                               ∑                  Pr[F \ FB = F 0 | FB ⊆ F] = Pr[J ∩ H = 0/ | FB ⊆ F] .
                F 0 :F 0 ∩FB =0,J∩H(F
                              /       0 ∪F )=0
                                          B  /


Lemma 5.5. Let B = B2i,k,t be a chord-cycle, then Pr[V (B) ∩ H∗ = 0/ | #H∗ = λ n] ≤ p(i, j, k) where:
                      k
   1. p(i, k) ≤ λ (i+ 2 −1)/4 if B consists of a 2i-cycle and a variable-variable k-path as a chord.
                      k
   2. p(i, k) ≤ λ (i+ 2 )/4 if B consists of 2i-cycle and a clause-clause k-path as a chord.
                     k−1
   3. p(i, k) ≤ λ (i+ 2 )/4 if B consists of 2i-cycle and a variable-clause k-path as a chord.

Proof. In item 1, we have i + (k/2) − 1 variables and i + (k/2) clauses. To bound the event {J ∩ H∗ = 0}, /
given the size of H∗ , observe that FB is fixed in the context of this event, and there is no pre-knowledge
whether FB is included in F or not. Therefore, J can be treated as a fixed set of variables, thus the choice
of H∗ is uniformly distributed over J. Recalling that #J ≥ (i + (k/2) − 1)/4, it follows that

                                                                n−#H∗
                                                                                       λn    
                               ∗              ∗                                   (i+ 2k −1)/4          k
                   Pr[J ∩ H = 0/ | #H = λ n] ≤                    #J 
                                                                  n          =           n     ≤ λ (i+ 2 −1)/4 .
                                                                  #J              (i+ 2k −1)/4

The last inequality follows from standard bounds on the binomial coefficients. This proves item 1. In the
same manner items 2, 3 are proven (just counting how many variables and clauses B contains, depending
on the type of its path).

Lemma 5.6. Let B = B2i,2 j,k,t be a bi-cycle, then Pr[V (B) ∩ H∗ = 0|#H
                                                                   /    ∗
                                                                          = λ n] ≤ p(i, j, k) where:
                           k
   1. p(i, j, k) ≤ λ (i+ j+ 2 −1)/4 if B consists of a 2i,2 j-cycles and a variable-variable k-path.
                           k
   2. p(i, j, k) ≤ λ (i+ j+ 2 )/4 if B consists of 2i,2 j-cycles and a clause-clause k-path.
                           k−1
   3. p(i, j, k) ≤ λ (i+ j+ 2 )/4 if B consists of 2i,2 j-cycles and a variable-clause k-path.

                      T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                                        645
                                   U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

Lemma 5.6 is proven in a similar way to Lemma 5.5.
    To complete the proof of Proposition 3.11, we use the union bound over all possible bi/chord-
cycles. We present the proof for the bi-cycle case with a variable-variable path; the proof for all other
cases is identical. s = si, j,k = i + j + (k/2) − 1 (namely, #V (B) = s and #FB = s + 1). We also let
pλ = Pr[#H∗ = λ n]. The probability of B is then at most
          n                n
                                                         d si, j,k +1 si, j,k /4
                                                                                 
                                       n                               si, j,k +1
        ∑ pλ · ∑k si, j,k · (si, j,k )! · (7n)         · 2
                                                         n
                                                                     ·λ          ≤
        λ =0     i, j, 2 =1

                                   7en s s s+1
          n            n                         s+1               n          n
                                                   d                                            7d
             p
        ∑ λ ∑k ·            7d ·
                                    s
                                        · s · n ·
                                                   n 2
                                                           · λ s/4
                                                                   ≤ ∑ λ ∑k (7e · d · λ 1/4 )s · n .
                                                                          p    ·
        λ =0          i, j, 2 =1                                     λ =0                                i, j, 2 =1


Let us now break the first sum according to the different values of λ . Proposition 3.10 implies that
pλ < n−5 for all λ > d −8 . Therefore the contribution of this part to the (double) summation is O(1/n).
For λ < d −8 , the latter simplifies to

                                                               n/d 8            n         s
                                                                                          1    7d
                                                               ∑ pλ ∑                         · .                                     (5.1)
                                                                                          2     n
                                                               λ =0        i, j, 2k =1

Finally observe that
          n   s                                                                           s
               1     7d                                        7d                           1                  7d        1
         ∑k 2 · n ≤                              ∑                +            ∑                ≤ (4 log n)3 ·    + n3 · 4 = o(1) .
                                                                n                           2                   n       n
       i, j, 2 =1                          i+ j+ 2k ≤4 log n           i+ j+ 2k ≥4 log n


Therefore, (5.1) sums up to o(1).

5.1   Proof of Proposition 3.11 (3)
The proof is basically the same as that of Proposition 3.11 (2). One defines the same notion of “cushioned”
core H∗ , and proceeds similarly. We therefore reprove only the last part—the union bound over all
possible cycles.
    First let us bound the number of cycles of length k. There are nk ways to choose the variables
                                                                             

inducing the cycle, and k!/2 ways to order them on the cycle. As for the set of clauses that induces the
cycle, once the cycle is fixed, we have at most (7n)k ways of choosing the third variable and setting the
polarity in every clause. We also let pλ = Pr[#H∗ = λ n].
    Using the union bound, the probability of a cycle of length at least k in the non-core factor graph is at
most
                                             t
                                                                         7en t t t
             n       n                                     n       n                  t
                          n             t    d       t/2                                  d
                p ·
            ∑ λ ∑ t          · t! · (7n)  ·    2
                                                 · λ     ≤      p
                                                           ∑ λ ∑ t·            · t · n ·   2
                                                                                              · λ t/2
           λ =0     t=k                      n             λ =0     t=k                   n
                n       n            √
            = ∑ pλ · ∑ (7e · d · λ )t .
                    λ =0       t=k


                               T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                                                 646
                  C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

Let us now break the first sum according to the different values of λ . Proposition 3.10 implies that there
exists a constant c such that pλ < n−3 for λ >√e−cd . Therefore the contribution of this part to the (double)
summation is O(1/n). For λ ≤ e−cd , 7e · d · e−cd = e−Θ(d) . In this case, the last summation is simply
the sum of a decreasing geometric series with quotient e−Θ(d) , which sums up to at most twice the first
term, namely e−Θ(dk) . The proposition then follows.


6    Discussion
Our results show that WP is effective in solving Pplant
                                                     n,p . Though not being the first to give an algorithm for
  plant
Pn,p , our results are a first example of rigorously analyzing a message passing algorithm on a natural
non-trivial random SAT distribution. We remark that our goal was to analyze WP under its most common
definition, resisting attempts to modify the algorithm in ways that would simplify the analysis. One such
simplification would result if in the first few iterations, messages are updated in parallel in two phases:
clause-variable messages updates and then variable-clause messages updates.
     In the non-planted case, for low density formulas (considerably below the satisfiability threshold),
some algorithms were rigorously shown to find w. h. p. a satisfying assignment efficiently [3, 8]. For
random k-SAT, k a sufficiently large constant, the best rigorously analyzed algorithm is given in [10].
Experimental results predict that as the density of the formula increases, more sophisticated algorithms
are needed in order to find a satisfying assignment.
     One possible explanation for the increasing computational hardness of finding solutions in the non-
planted case is based on the geometry of the space of satisfying assignment. It is now established [2, 1]
that for k-SAT just below the satisfiability threshold and k ≥ 8, the space of solutions decomposes into an
exponential number of (Hamming-distance) connected clusters such that the distance between each two
is linear in the number of variables. Such complex geometry of the space of solutions poses a complex
algorithmic challenge.
     Our results, and similarly [18, 17], show that in the planted case (with density some large constant
above the satisfiability threshold), the algorithmic task of finding a satisfying assignment is relatively
easy, and in particular, the naïve WP algorithm is effective in finding satisfying assignments. Planted
formulas in this regime have only one cluster of satisfying assignments (see [11] for more discussion).
     We conclude with an open problem. Can our analysis be extended to show that Belief Propagation
(BP) finds a satisfying assignment to Pplant
                                          n,p in the setting of Theorem 1.1? Experimental results predict
the answer to be positive. However, our analysis of WP does not extend as is to BP. In WP, all warnings
received by a variable (or by a clause) have equal weight, but in BP this need not be the case (there is a
probability level associated with each warning). In particular, this may lead to the case that messages
received from non-core portions of the formula can affect the core, a possibility that our analysis managed
to exclude for the WP algorithm.


Acknowledgement
We thank Eran Ofek for many useful discussions. This work was done while the authors were visiting
Microsoft Research, Redmond, Washington.

                     T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                             647
                       U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

References
 [1] D IMITRIS ACHLIOPTAS AND A MIN C OJA -O GHLAN: Algorithmic barriers from phase transitions.
     In Proc. 49th FOCS, pp. 793–802. IEEE Comp. Soc. Press, 2008. [doi:10.1109/FOCS.2008.11] 647

 [2] D IMITRIS ACHLIOPTAS , A MIN C OJA -O GHLAN , AND F EDERICO R ICCI -T ERSENGHI: On the
     solution-space geometry of random constraint satisfaction problems. Random Structures & Algo-
     rithms, 38(3):251–268, 2011. Preliminary version in STOC’06. [doi:10.1002/rsa.20323] 647

 [3] M IKHAIL A LEKHNOVICH AND E LI B EN -S ASSON: Linear upper bounds for random walk on
     small density random 3-CNFs. SIAM J. Comput., 36(5):1248–1263, 2007. Preliminary version in
     FOCS’03. [doi:10.1137/S0097539704440107] 620, 621, 647

 [4] N OGA A LON AND NABIL K AHALE: A spectral technique for coloring random 3-colorable
     graphs.    SIAM J. Comput., 26(6):1733–1748, 1997.      Prelimiary version in STOC’94.
     [doi:10.1137/S0097539794270248] 618, 621, 624, 627, 643

 [5] N OGA A LON , M ICHAEL K RIVELEVICH , AND B ENNY S UDAKOV: Finding a large hidden clique in
     a random graph. Random Structures & Algorithms, 13(3-4):457–466, 1998. Preliminary version in
     SODA’98. [doi:10.1002/(SICI)1098-2418(199810/12)13:3/4<457::AID-RSA14>3.0.CO;2-W] 618

 [6] N OGA A LON AND J OEL S PENCER: The Probabilistic Method. Wiley-Interscience Series in Discrete
     Mathematics and Optimization. Wiley, 2nd edition, 2000. 624, 626

 [7] A LFREDO B RAUNSTEIN , M ARC M ÉZARD , AND R ICCARDO Z ECCHINA: Survey propaga-
     tion: An algorithm for satisfiability. Random Structures & Algorithms, 27(2):201–226, 2005.
     [doi:10.1002/rsa.20057] 618, 634

 [8] A NDREI Z. B RODER , A LAN M. F RIEZE , AND E LI U PFAL: On the satisfiability and maximum
     satisfiability of random 3-CNF formulas. In Proc. 4th Ann. ACM-SIAM Symp. on Discrete Algorithms
     (SODA’93), pp. 322–330. ACM Press, 1993. [ACM:313559.313794] 620, 647

 [9] H UI C HEN AND A LAN M. F RIEZE: Coloring bipartite hypergraphs. In Proc. 5th Ann. Conf. on
     Integer Programming and Combinatorial Optimization (IPCO ’96), pp. 345–358, London, UK,
     1996. Springer. [doi:10.1007/3-540-61310-2_26] 618, 621

[10] A MIN C OJA -O GHLAN: A better algorithm for random k-SAT. SIAM J. Comput., 39(7):2823–2864,
     2010. Preliminary version in ICALP’09. [doi:10.1137/09076516X] 647

[11] A MIN C OJA -O GHLAN , M ICHAEL K RIVELEVICH , AND DAN V ILENCHIK: Why almost all k-CNF
     formulas are easy. In Proc. 13th Internat. Conf. on Analysis of Algorithms (AofA’07), pp. 89–102.
     DMTCS, 2007. DMTCS. 620, 647

[12] A MIN C OJA -O GHLAN AND A NGELICA Y. PACHON -P INZON: The decimation process in random
     k-SAT. SIAM J. Discrete Math., 26(4):1471–1509, 2012. Preliminary versions in ICALP’11 and
     SODA’11. [doi:10.1137/110842867] 618

                   T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                        648
                C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

[13] S TEPHEN A. C OOK: The complexity of theorem-proving procedures. In Proc. 3rd STOC, pp.
     151–158. ACM Press, 1971. [doi:10.1145/800157.805047] 617

[14] JAMES M. C RAWFORD AND L ARRY D. AUTON: Experimental results on the crossover point in ran-
     dom 3-SAT. Artif. Intell., 81(1-2):31–57, 1996. Preliminary version in AAAI’93. [doi:10.1016/0004-
     3702(95)00046-1] 618

[15] O LIVIER D UBOIS , YACINE B OUFKHAD , AND JACQUES M ANDLER: Typical random 3-SAT for-
     mulae and the satisfiability threshold. In Proc. 11th Ann. ACM-SIAM Symp. on Discrete Algorithms
     (SODA’00), pp. 126–127. ACM Press, 2000. [ACM:338243] 618

[16] U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK: Complete convergence of message
     passing algorithms for some satisfiability problems. In Proc. 10th Internat. Workshop on Random-
     ization and Computation (RANDOM’06), pp. 339–350. Springer, 2006. [doi:10.1007/11830924_32]
     617

[17] U RIEL F EIGE AND DAN V ILENCHIK: A local search algorithm for 3SAT. Technical report, The
     Weizmann Institute of Science, 2004. Available at author’s home page. 623, 647

[18] A BRAHAM F LAXMAN: A spectral technique for random satisfiable 3CNF formulas. Random Struc-
     tures & Algorithms, 32(4):519–534, 2008. Preliminary version in SODA’03. [doi:10.1002/rsa.20213]
     618, 621, 623, 624, 627, 629, 642, 643, 647

[19] E HUD F RIEDGUT: Sharp thresholds of graph properties, and the k-SAT problem. J. Amer. Math.
     Soc., 12(4):1017–1054, 1999. [doi:10.1090/S0894-0347-99-00305-7] 618

[20] WASSILY H OEFFDING: Probability inequalities for sums of bounded random variables. J. Amer.
     Stat. Assoc., 58(301):13–30, 1963. JSTOR. 626

[21] A LEXIS C. K APORIS , L EFTERIS M. K IROUSIS , AND E FTHIMIOS G. L ALAS: The probabilistic
     analysis of a greedy satisfiability algorithm. Random Structures & Algorithms, 28(4):444–480,
     2006. Preliminary version in ESA’02. [doi:10.1002/rsa.20104] 618

[22] E LIAS KOUTSOUPIAS AND C HRISTOS H. PAPADIMITRIOU: On the greedy algorithm for satisfia-
     bility. Inform. Process. Lett., 43(1):53–55, 1992. [doi:10.1016/0020-0190(92)90029-U] 618

[23] M ICHAEL K RIVELEVICH AND DAN V ILENCHIK: Solving random satisfiable 3CNF formulas in
     expected polynomial time. In Proc. 17th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’06),
     pp. 454–463. ACM Press, 2006. [doi:10.1145/1109557.1109608] 618

[24] M ICHAEL L UBY, M ICHAEL M ITZENMACHER , A MIN S HOKROLLAHI , AND DANIEL A. S PIEL -
     MAN : Efficient erasure correcting codes. IEEE Trans. Inform. Theory, 47(2):569–584, 2001.
     [doi:10.1109/18.910575] 621

[25] M ICHAEL L UBY, M ICHAEL M ITZENMACHER , A MIN S HOKROLLAHI , AND DANIEL A. S PIEL -
     MAN: Improved low-density parity-check codes using irregular graphs. IEEE Trans. Inform. Theory,
     47(2):585–598, 2001. Preliminary versions in ISIT’98 and STOC’98. [doi:10.1109/18.910576] 621

                   T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                         649
                       U RIEL F EIGE , E LCHANAN M OSSEL , AND DAN V ILENCHIK

[26] J UDEA P EARL: Reverend Bayes on inference engines: A distributed hierarchical approach. In Proc.
     2nd Nat. Conf. on Artificial Intelligence (AAAI’82), pp. 133–136, 1982. AAAI. 618

[27] T HOMAS J. R ICHARDSON , A MIN S HOKROLLAHI , AND R ÜDIGER L. U RBANKE: Design of
     capacity-approaching irregular low-density parity-check codes. IEEE Trans. Inform. Theory,
     47(2):619–637, 2001. [doi:10.1109/18.910578] 621


AUTHORS

      Uriel Feige
      Professor
      The Weizmann Institute of Science, Rehovot, Israel
      uriel.feige weizmann ac il
      http://www.wisdom.weizmann.ac.il/~feige/


      Elchanan Mossel
      Professor
      U.C. Berkeley, CA, USA
      mossel stat berkeley edu
      http://www.stat.berkeley.edu/~mossel/


      Dan Vilenchik
      Postdoctoral fellow
      The Weizmann Institute of Science, Rehovot, Israel
      dan.vilenchik weizmann ac il
      http://www.wisdom.weizmann.ac.il/~vilenchi/


ABOUT THE AUTHORS

      U RIEL F EIGE is a professor at the Weizmann Institute, though this work was done while he
         was a member of the theory group at Microsoft Research. His main research interests
         involve studying the borderline between P and NP as it manifests itself in approximation
         algorithms, heuristics, and exact algorithms that are not necessarily polynomial time.




                   T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                          650
          C ONVERGENCE OF M ESSAGE PASSING FOR S ATISFIABILITY P ROBLEMS

E LCHANAN M OSSEL is a Professor of Statistics and Computer Science at U. C. Berkeley.
   He received his Ph. D. in mathematics from Hebrew University in 2000. After his Ph. D.,
   he was a post-doc at the Theory Group of Microsoft Research, Redmond and a Miller
   fellow in Statistics and Computer Science here at U. C. Berkeley. His research interests
   include Combinatorial Statistics, Discrete Fourier Analysis and Influences, Randomized
   Algorithms, Computational Complexity, MCMC, Markov Random Fields, Social Choice,
   Game Theory and Evolution.


DAN V ILENCHIK graduated from Tel Aviv University in 2009; his advisor was Michael
  Krivelevich. His thesis focused on average-case analysis of optimization problems, in
  particular k-colorability and k-SAT. His interests include average-case analysis and
  combinational optimization.




             T HEORY OF C OMPUTING, Volume 9 (19), 2013, pp. 617–651                          651