Abstract

We consider ferromagnetic Ising models on graphs
that converge locally to trees. Examples include random regular
graphs with bounded degree and uniformly random graphs with
bounded average degree. We prove that the “cavity”
prediction for the
limiting
free energy per spin is correct for
any positive temperature and external field.
Further, local marginals can be approximated
by iterating a set of mean field (cavity) equations.
Both results are achieved by proving the local convergence
of the Boltzmann distribution on the original graph to the Boltzmann
distribution on the appropriate infinite random tree.

A ferromagnetic Ising model on the
finite
graph $G$ (with vertex set $V$
and edge set $E$) is defined by the following Boltzmann distributions over
$\underline{x}=\{x_i : i\in V\}$,
with
$x_i\in\{+1,-1\}$:

$$\mu(\underline{x})=\frac{1}{Z(\beta,B)}\exp\Big\{\beta\sum_{(i,j)\in E}x_ix_j+B\sum_{i\in V}x_i\Big\}. \tag{1}$$

These distributions are parametrized by the
“magnetic field” $B$ and “inverse temperature” $\beta\ge 0$,
where the partition function $Z(\beta,B)$ is
fixed by the normalization condition $\sum_{\underline{x}}\mu(\underline{x})=1$.
Throughout the paper, we will be interested in sequences of
graphs $G_n=(V_n\equiv[n],E_n)$ of diverging size $n$.
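For very small graphs the definition (1) can be evaluated by direct enumeration. The following is a minimal sketch, not part of the original text; the function name `boltzmann` and the triangle example are illustrative assumptions.

```python
import itertools
import math

def boltzmann(n, edges, beta, B):
    """Return (Z, mu) for the Ising model (1) on vertices 0..n-1 by enumeration."""
    weights = {}
    for x in itertools.product((+1, -1), repeat=n):
        exponent = beta * sum(x[i] * x[j] for i, j in edges) + B * sum(x)
        weights[x] = math.exp(exponent)
    Z = sum(weights.values())            # partition function Z(beta, B)
    mu = {x: w / Z for x, w in weights.items()}
    return Z, mu

# Hypothetical example: a triangle at beta = 0.5, B = 0.2.
Z, mu = boltzmann(3, [(0, 1), (1, 2), (0, 2)], beta=0.5, B=0.2)
```

The cost is exponential in $n$, so this is only useful as a reference for checking approximations on toy instances.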

Nonrigorous statistical mechanics techniques, such as the “replica”
and “cavity methods,” allow one to make a number of predictions
on the model (1), when the graph G
“lacks
any finite-dimensional structure.”
The most basic quantity in this context is the
asymptotic free entropy density

$$\phi(\beta,B)\equiv\lim_{n\to\infty}\frac{1}{n}\log Z_n(\beta,B) \tag{2}$$

(this quantity is also
sometimes called free energy or pressure in the literature).
The limiting free entropy density and the large deviation properties
of the Boltzmann distribution were characterized
in great detail [9] in the case of a complete
graph $G_n=K_n$ (the inverse temperature must then be scaled by $1/n$
to get a nontrivial limit).
Statistical physics predictions exist, however, for a much wider class
of graphs,
including most notably sparse random graphs with bounded average degree;
see, for instance, [8, 15, 18].
This is a direction of interest for at least two reasons:
(i) Sparse graphical structures arise in a number of problems from
combinatorics and theoretical computer science.
Examples include random satisfiability, coloring of random graphs and
graph partitioning [21].
In all of these cases, the uniform measure
over solutions can be regarded as the Boltzmann distribution
for a modified spin glass with multispin interactions.
Such problems have been successfully attacked using nonrigorous
statistical mechanics techniques. A mathematical foundation of this approach is still lacking, and would be
extremely useful.

(ii) Sparse graphs allow one to introduce a nontrivial notion of distance
between vertices, namely the length of the shortest path
connecting them. This geometrical structure allows for new characterizations
of the measure (1) in terms of correlation decay.
This type of characterization is in turn related to the theory of Gibbs
measures on infinite trees [17].

The asymptotic free entropy density (2) was
determined rigorously only in a few cases for sparse graphs.
In [11], this task was accomplished for random regular graphs.
De Sanctis and Guerra [7] developed interpolation
techniques for random graphs with independent edges (Erdős–Rényi type)
but only determined the free entropy density at high temperature and at zero
temperature (in both cases with vanishing magnetic field).
The latter is in fact equivalent to counting the number of
connected components of a random graph.
Interestingly, the partition function Zn(β,B)
can be approximated in polynomial time for β≥0, using an appropriate
Markov chain Monte Carlo algorithm [14].
It is intriguing that no general approximation algorithm exists
in the case β<0 (the “antiferromagnetic” Ising model).
Correspondingly, the statistical physics conjecture for the
free entropy density [21] becomes significantly
more intricate (presenting the so-called “replica symmetry breaking”
phenomenon).

In this paper we generalize the previous results by
rigorously verifying the validity of the
Bethe free entropy prediction for the value of the
limit in (2) for generic graph
sequences that converge locally to trees.
Indeed, we control the
free entropy density by proving that the Boltzmann measure
(1)
converges locally to the Boltzmann measure of a model on
a tree. The philosophy is related
to the local weak convergence method of [2].

Finally, several of the proofs have an algorithmic interpretation,
providing an efficient procedure for approximating the local marginals
of the
Boltzmann measure. The essence of this procedure consists in solving
by iteration certain mean field (cavity) equations.
Such an algorithm is known in artificial intelligence and computer
science under the name of belief propagation. Despite its success
and wide applicability, only weak performance guarantees
have been proved so far. Typically, it is possible to prove its correctness
in the high temperature regime, as a consequence
of a uniform decay of correlations holding there (spatial mixing)
[26, 3, 23].
The behavior of iterative inference algorithms on Ising models
was recently considered in [22, 24].

The emphasis of the present paper is on
the low-temperature regime in which uniform decorrelation does not hold.
We are able to prove that belief propagation converges exponentially
fast on any graph, and that the resulting estimates are asymptotically
exact for large locally tree-like graphs. The main idea is to
introduce a magnetic field to break explicitly the +/− symmetry,
and to carefully exploit the monotonicity properties of the model.

A key step consists of estimating the correlation between the root
spin of an Ising model on a tree and positive boundary conditions.
Ising models on trees are interesting per se, and
have been the object of significant mathematical work; see, for instance,
[20, 16, 10]. The question considered here
appears, however, to be novel.

The next section provides the basic technical definitions (in particular
concerning graphs and local convergence to trees), and the formal
statement of our main results.
Notation and certain key tools are described in Section 3
with Section 4
devoted to proofs of the relevant properties of Ising models on trees
(which are of independent interest). The latter are used in
Sections 5 and 6
to derive our main results concerning models on tree-like graphs.
A companion paper [5] deals with the related
challenging problem of spin glass models on sparse graphs.

The next subsections contain some basic definitions on graph
sequences and the notion of local convergence to random trees.
Sections 2.2 and 2.3 present
our results on the free entropy density and the algorithmic implications
of our analysis.

2.1 Locally tree-like graphs

Let $P=\{P_k : k\ge 0\}$ be a probability distribution over
the nonnegative integers, with finite,
positive first moment, and denote by

$$\rho_k=\frac{kP_k}{\sum_{l=1}^{\infty}lP_l} \tag{3}$$

its size-biased version. For any t≥0, we let
T(P,ρ,t) denote the random rooted tree generated as follows.
First draw an integer k with distribution Pk, and connect
the root to k offspring. Then recursively, for each node in
the last generation, generate an integer k independently with distribution
ρk, and connect the node to k−1 new nodes. This is repeated
until the tree has t generations.
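The construction of $T(P,\rho,t)$ can be sketched as a sampler that tracks only the generation sizes. This is an illustrative sketch under the assumption that $P$ has finite support and is passed as a dictionary `{k: P_k}`; the helper names `size_biased`, `draw` and `sample_generation_sizes` are hypothetical.

```python
import random

def size_biased(P):
    """rho_k = k P_k / sum_l l P_l, as in (3); P is a dict {k: P_k}."""
    mean = sum(k * p for k, p in P.items())
    return {k: k * p / mean for k, p in P.items() if k > 0}

def draw(dist, rng):
    """Draw one integer from the finite distribution dist = {k: prob}."""
    ks = list(dist)
    return rng.choices(ks, weights=[dist[k] for k in ks])[0]

def sample_generation_sizes(P, t, rng):
    """Sizes of generations 0..t of one sample of T(P, rho, t)."""
    rho = size_biased(P)
    sizes = [1]                          # generation 0: the root
    if t == 0:
        return sizes
    sizes.append(draw(P, rng))           # the root has k ~ P offspring
    for _ in range(t - 1):
        # each node of the last generation draws k ~ rho and gets k - 1 children
        sizes.append(sum(draw(rho, rng) - 1 for _ in range(sizes[-1])))
    return sizes

# Deterministic sanity case: P_3 = 1 gives the 3-regular tree.
sizes = sample_generation_sizes({3: 1.0}, 3, random.Random(0))
```

For a degenerate $P$ concentrated on a single degree the sampler is deterministic, which makes it easy to check against the regular tree.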

Sometimes it will be useful to consider the ensemble T(ρ,t)
whereby the root node has degree k−1 with probability ρk.
We will drop the degree distribution arguments from
T(P,ρ,t) or T(ρ,t) and write T(t) whenever
clear from the context.
Notice that the infinite trees
T(P,ρ,∞) and T(ρ,∞) are
well defined.

The average branching factor of trees will be denoted by $\bar\rho$,
and the average root degree by $\bar P$. In formulae

$$\bar P\equiv\sum_{k=0}^{\infty}kP_k,\qquad \bar\rho\equiv\sum_{k=1}^{\infty}(k-1)\rho_k. \tag{4}$$

We denote by Gn=(Vn,En) a graph with vertex set
Vn≡[n]={1,…,n}. The distance d(i,j) between
i,j∈Vn is the length of the shortest path from i to j in Gn. Given a vertex i∈Vn, we let
Bi(t) be the set of vertices whose distance from i is at most
t. With a slight abuse of notation, Bi(t) will also
denote the
subgraph induced by those vertices. For $i\in V_n$, we let $\partial i$ denote
the set of its neighbors, $\partial i\equiv\{j\in V_n : (i,j)\in E_n\}$,
and $|\partial i|$ its size (i.e., the degree of $i$).

This paper is concerned with sequences of graphs $\{G_n\}_{n\in\mathbb{N}}$
of diverging size that converge locally to trees.
Consider two trees T1 and T2 with vertices labeled arbitrarily.
We shall write T1≃T2 if the two trees become identical when vertices
are relabeled from 1 to |T1|=|T2|, in a
breadth first fashion, and following lexicographic order among siblings.
Definition. Considering a sequence of graphs $\{G_n\}_{n\in\mathbb{N}}$, let
$\mathbb{P}_n$ denote the law induced on the ball $B_i(t)$
in $G_n$
centered at a uniformly chosen random vertex $i\in[n]$.
We say that $\{G_n\}$ converges locally to the random tree
$T(P,\rho,\infty)$ if, for any $t$ and any rooted tree $T$
with $t$ generations,

$$\lim_{n\to\infty}\mathbb{P}_n\{B_i(t)\simeq T\}=\mathbb{P}\{T(P,\rho,t)\simeq T\}. \tag{5}$$

Definition. We say that a sequence of graphs
$\{G_n\}_{n\in\mathbb{N}}$ is uniformly sparse if

$$\lim_{l\to\infty}\limsup_{n\to\infty}\frac{1}{n}\sum_{i\in V_n}|\partial i|\,\mathbb{I}(|\partial i|\ge l)=0. \tag{6}$$

2.2 Free entropy

According to the statistical physics derivation [18], the model
(1)
has a line of first-order phase
transitions for B=0 and β>βc
[i.e., where the continuous function B↦ϕ(β,B)
exhibits a discontinuous derivative].
The critical temperature depends on the
graph only through the average branching factor and is determined by the
condition

$$\bar\rho\,\tanh\beta_c=1. \tag{7}$$

Notice that $\beta_c\simeq 1/\bar\rho$ for large degrees.
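Condition (7) is easy to evaluate for a given degree distribution. The sketch below assumes $P$ is supplied as a finite dictionary `{k: P_k}` (an infinite-support law such as the Poisson must be truncated); the function names are illustrative.

```python
import math

def mean_branching(P):
    """rho_bar = sum_k (k-1) rho_k = sum_k k(k-1) P_k / sum_k k P_k, from (3)-(4)."""
    mean = sum(k * p for k, p in P.items())
    return sum(k * (k - 1) * p for k, p in P.items()) / mean

def beta_c(P):
    """Solve rho_bar * tanh(beta_c) = 1; returns inf when rho_bar <= 1."""
    rho_bar = mean_branching(P)
    return math.atanh(1.0 / rho_bar) if rho_bar > 1 else math.inf

# Hypothetical examples: a 3-regular law, and a Poisson law of mean 3 truncated at k < 60.
regular3 = {3: 1.0}
poisson3 = {k: math.exp(-3.0) * 3.0 ** k / math.factorial(k) for k in range(60)}
```

For the 3-regular law $\bar\rho=2$, so $\beta_c=\operatorname{atanh}(1/2)$; for a Poisson law the size-biased branching factor equals the mean.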

The asymptotic free-entropy density is given in terms of the
fixed point of a distributional recursion.
One characterization of this fixed point is as follows.

and the $h_i^{(t)}$'s
are i.i.d. copies of $h^{(t)}$ that are independent of $K$.
If $B>0$ and $\rho$ has finite first moment, then the
distributions of $h^{(t)}$ are stochastically monotone
and $h^{(t)}$ converges in distribution to the unique fixed point $h^*$
of the recursion (8) that is
supported on $[0,\infty)$.

Our next result confirms the statistical physics prediction for the
free-entropy density.

Theorem 2.2

Let $\{G_n\}_{n\in\mathbb{N}}$ be a sequence of uniformly sparse graphs
that converges locally to $T(P,\rho,\infty)$. If $\rho$
has finite first moment (that is, if $P$ has finite second moment),
then for any $B\in\mathbb{R}$ and $\beta\ge 0$ the following
limit exists:

$$\lim_{n\to\infty}\frac{1}{n}\log Z_n(\beta,B)=\phi(\beta,B). \tag{10}$$

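Moreover, for $B>0$ the limit is given by

$$\begin{aligned}
\phi(\beta,B)\equiv{}&\frac{\bar P}{2}\log\cosh(\beta)-\frac{\bar P}{2}\,\mathbb{E}\log[1+\tanh(\beta)\tanh(h_1)\tanh(h_2)]\\
&+\mathbb{E}\log\Big\{e^{B}\prod_{i=1}^{L}[1+\tanh(\beta)\tanh(h_i)]+e^{-B}\prod_{i=1}^{L}[1-\tanh(\beta)\tanh(h_i)]\Big\},
\end{aligned} \tag{11}$$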

where $L$ has distribution $P_l$ and is independent of the
“cavity fields” $h_i$ that are i.i.d. copies of the fixed point
$h^*$ of Lemma 2.1.
Also, $\phi(\beta,B)=\phi(\beta,-B)$
and $\phi(\beta,0)$ is the limit of $\phi(\beta,B)$ as $B\to 0$.
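For a regular degree law ($P_k=1$) every cavity field equals the same deterministic value, and (11) reduces to a one-line formula. The sketch below is illustrative only: it assumes that the recursion (8) has the standard cavity form $h\leftarrow B+\sum_{i=1}^{K-1}\operatorname{atanh}[\tanh\beta\,\tanh h_i]$ started from $h=0$, which is an assumption not confirmed by the text reproduced here. The comparison function `ring_free_entropy` uses the textbook transfer-matrix value for the cycle ($k=2$).

```python
import math

def cavity_field(beta, B, k, iters=500):
    """Iterate h <- B + (k-1) atanh(tanh(beta) tanh(h)) from h = 0 (assumed form of (8))."""
    h = 0.0
    for _ in range(iters):
        h = B + (k - 1) * math.atanh(math.tanh(beta) * math.tanh(h))
    return h

def bethe_free_entropy_regular(beta, B, k):
    """Formula (11) with P_k = 1, so that L = k and h_1 = h_2 = ... = h*."""
    h = cavity_field(beta, B, k)
    tb, th = math.tanh(beta), math.tanh(h)
    edge_term = 0.5 * k * (math.log(math.cosh(beta)) - math.log(1 + tb * th * th))
    site_term = math.log(math.exp(B) * (1 + tb * th) ** k
                         + math.exp(-B) * (1 - tb * th) ** k)
    return edge_term + site_term

def ring_free_entropy(beta, B):
    """Transfer-matrix free entropy density of the cycle (degree k = 2)."""
    lam = math.exp(beta) * math.cosh(B) + math.sqrt(
        math.exp(2 * beta) * math.sinh(B) ** 2 + math.exp(-2 * beta))
    return math.log(lam)
```

Comparing `bethe_free_entropy_regular(beta, B, 2)` with `ring_free_entropy(beta, B)` gives a quick consistency check of the formula in a case where the answer is known in closed form.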

The proof of Theorem 2.2 is based on two steps:

(a) Reduce the computation of
$\phi_n(\beta,B)=\frac{1}{n}\log Z_n(\beta,B)$ to computing expectations
of local (in $G_n$)
quantities with respect to the Boltzmann measure (1).
This is achieved by noticing that the derivative of $\phi_n(\beta,B)$
with respect to $\beta$ is a sum of such expectations.

(b) Show that expectations of local quantities on $G_n$ are well
approximated by the same expectations with respect to an Ising model on the
associated tree $T(P,\rho,t)$ (for $t$ and $n$ large).
This is proved by showing that, on such a tree, local expectations are
insensitive to boundary conditions that dominate stochastically
free boundaries.

The theorem then follows by monotonicity arguments.
The key step is of course the last one. A stronger requirement would be that
these
expectation values are insensitive to any boundary condition,
which would coincide with uniqueness of the Gibbs measure on
$T(P,\rho,\infty)$. Such a requirement would allow
for an elementary proof, but holds only at “high” temperature,
$\beta\le\beta_c$.

Indeed, insensitivity to positive boundary conditions is proved
in Section 4 for
the following collection of
trees of conditionally independent (and of bounded
average) offspring numbers.

Definition.
An infinite tree $T$ rooted at the vertex ø
is called conditionally independent if
for each integer $k\ge 0$, conditional on
the subtree $T(k)$ of the first $k$ generations of $T$,
the numbers of offspring $\Delta_j$ for $j\in\partial T(k)$
are independent of each other, where
$\partial T(k)$ denotes the set of vertices at generation $k$.
We further assume that the [conditional on $T(k)$]
first moments of $\Delta_j$ are uniformly bounded by a given
nonrandom finite constant $\Delta$.

Beyond the random tree $T(P,\rho,\infty)$,
these include deterministic trees with bounded degrees
and certain multi-type branching processes (such as
random bipartite trees and percolation clusters
on deterministic trees of bounded degree).
Consequently, Theorem 2.2
extends to any uniformly sparse graph sequence
that converges locally to a random tree $T$ of
the form of Definition 2.2, except that
the formula for $\phi(\beta,B)$ is in general more
involved than the one given in (11).
For example, such an extension allows one to handle
uniformly random bipartite graphs with different
degree distributions $P_k$ and $Q_k$ for the two types of vertices.

While we refrain from formalizing and proving such
generalizations, we note in passing that our derivation
of the formula (11) implicitly
uses the fact that $T(P,\rho,\infty)$ possesses the
involution invariance of [2]. As pointed out
in [1], every local limit of finite graphs
must have the involution invariance property (which
clearly not every conditionally independent tree has).

2.3 Algorithmic implications

The free entropy density is not the only quantity that can be
characterized for Ising models on locally tree-like graphs.
Indeed local marginals can be efficiently computed with good accuracy.
The basic idea is to solve a set of mean field equations iteratively.
These are known as Bethe–Peierls or cavity equations and the
corresponding algorithm is referred to as “belief propagation” (BP).

More precisely, associate to each directed edge $i\to j$ in the graph,
with $(i,j)\in G$, a distribution $\nu_{i\to j}(x_i)$ over $x_i\in\{+1,-1\}$.
In the computer science literature these distributions are referred
to as “messages.” They are updated as follows:

$$\nu^{(t+1)}_{i\to j}(x_i)=\frac{1}{z^{(t)}_{i\to j}}\,e^{Bx_i}\prod_{l\in\partial i\setminus j}\sum_{x_l}e^{\beta x_ix_l}\,\nu^{(t)}_{l\to i}(x_l). \tag{12}$$

The initial conditions $\nu^{(0)}_{i\to j}(\cdot)$ may be taken
to be
uniform or chosen according to some heuristic. We will say that the
initial condition is positive if $\nu^{(0)}_{i\to j}(+1)\ge\nu^{(0)}_{i\to j}(-1)$ for each of these messages.
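The update (12) can be sketched in a few lines. The code below is an illustrative implementation with uniform (hence positive) initial messages and parallel updates; the single-vertex marginal estimate at the end (the product over all neighbors) is the standard BP estimate and is an assumption of this sketch rather than a formula quoted from the text. The brute-force routine `exact_marginals` is included only for comparison on toy graphs.

```python
import itertools
import math

def bp_marginals(n, edges, beta, B, iters=50):
    """Run update (12) from uniform messages; return the BP estimate of P(x_i = +1)."""
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    # nu[(i, j)] stores nu_{i->j}(+1); nu_{i->j}(-1) is 1 - nu[(i, j)].
    nu = {(i, j): 0.5 for i in range(n) for j in nbrs[i]}

    def weight(i, x, exclude, msgs):
        """e^{B x} times the product over l in nbrs(i) minus exclude of sum_{x_l} e^{beta x x_l} nu_{l->i}(x_l)."""
        w = math.exp(B * x)
        for l in nbrs[i]:
            if l != exclude:
                p = msgs[(l, i)]
                w *= math.exp(beta * x) * p + math.exp(-beta * x) * (1 - p)
        return w

    for _ in range(iters):
        new = {}
        for (i, j) in nu:
            plus, minus = weight(i, +1, j, nu), weight(i, -1, j, nu)
            new[(i, j)] = plus / (plus + minus)   # division plays the role of 1/z
        nu = new
    out = []
    for i in range(n):
        plus, minus = weight(i, +1, None, nu), weight(i, -1, None, nu)
        out.append(plus / (plus + minus))
    return out

def exact_marginals(n, edges, beta, B):
    """Brute-force P(x_i = +1) under (1), for comparison on small graphs."""
    plus, Z = [0.0] * n, 0.0
    for x in itertools.product((+1, -1), repeat=n):
        w = math.exp(beta * sum(x[i] * x[j] for i, j in edges) + B * sum(x))
        Z += w
        for i in range(n):
            if x[i] == 1:
                plus[i] += w
    return [p / Z for p in plus]
```

On a finite tree the messages stop changing after a number of iterations equal to the diameter, and the estimates coincide with the exact marginals; on graphs with cycles they are only approximations.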

Our next result concerns the uniform exponential
convergence of the BP iteration to the same fixed
point of (12), irrespective of its
positive initial condition.
Here and below, we denote by ∥p−q∥TV the total
variation distance
between distributions p and q.

Theorem 2.3

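Assume $\beta\ge 0$, $B>0$ and $G$ is a graph of finite
maximal degree $\Delta$. Then,
there exist $A=A(\beta,B,\Delta)$ finite,
$\lambda=\lambda(\beta,B,\Delta)>0$
and a fixed point $\{\nu^*_{i\to j}\}$ of the BP iteration
(12) such that for any positive initial condition
$\{\nu^{(0)}_{l\to k}\}$ and all $t\ge 0$,

$$\sup_{(i,j)\in E}\big\|\nu^{(t)}_{i\to j}-\nu^*_{i\to j}\big\|_{TV}\le A\exp(-\lambda t). \tag{13}$$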

For $i_*\in V$ let $U\equiv B_{i_*}(r)$ be the ball of radius $r$ around $i_*$ in $G$,
denoting by $E_U$ its edge set, by $\partial U$ its border
(i.e., the set of its vertices at distance $r$ from $i_*$),
and for each $i\in\partial U$ let
$j(i)$ denote any one fixed neighbor of $i$ in $U$.

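Our next result shows that the probability distribution

$$\nu_U(\underline{x}_U)=\frac{1}{z_U}\exp\Big\{\beta\sum_{(i,j)\in E_U}x_ix_j+B\sum_{i\in U\setminus\partial U}x_i\Big\}\prod_{i\in\partial U}\nu^*_{i\to j(i)}(x_i), \tag{14}$$

with $\{\nu^*_{i\to j}(\cdot)\}$ the fixed point of the BP iteration
per Theorem 2.3,
is a good approximation for the marginal $\mu_U(\cdot)$
of variables $\underline{x}_U\equiv\{x_i : i\in U\}$ under the Ising model
(1).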

Theorem 2.4

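Assume $\beta\ge 0$, $B>0$ and $G$ is a graph of finite maximal
degree $\Delta$.
Then, there exist
finite $c=c(\beta,B,\Delta)$ and $\lambda=\lambda(\beta,B,\Delta)>0$ such
that for any $i_*\in G$ and $U=B_{i_*}(r)$, if $B_{i_*}(t)$
is a
tree then

$$\|\mu_U-\nu_U\|_{TV}\le\exp\{c^{r+1}-\lambda(t-r)\}. \tag{15}$$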

2.4 Examples

Random regular graphs

Let $G_n$ be a uniformly random regular graph
with degree
$k$. As $n\to\infty$, the sequence $\{G_n\}$ is obviously uniformly sparse,
and converges locally almost surely to the rooted infinite tree of
degree $k$ at every vertex.
Therefore, in this case Theorem 2.2 applies
with $P_k=1$ and $P_i=0$ for $i\ne k$.
The distributional recursion
(8) then evolves with a deterministic sequence $h^{(t)}$,
recovering the result of [11].

Erdős–Rényi graphs

Let $G_n$ be a uniformly random graph
with $m=n\gamma$ edges over $n$ vertices. The sequence $\{G_n\}$ converges
locally almost surely to a Galton–Watson tree with Poisson offspring
distribution of mean $2\gamma$. This corresponds to taking
$P_k=(2\gamma)^ke^{-2\gamma}/k!$.
The same happens to classical variants of this ensemble. For instance,
one can add an edge independently for each pair $(i,j)$ with probability
$2\gamma/n$, or consider a multi-graph with Poisson($2\gamma/n$)
edges between each pair $(i,j)$.

The sequence
{Gn} is with probability one uniformly sparse in
each of these cases. Thus,
Theorem 2.2 extends the results of
[7] to arbitrary nonzero temperature and magnetic field.

Arbitrary degree distribution

Let P be a distribution with
finite second moment and Gn a uniformly random graph with degree
distribution P
(more precisely, we set
the number of vertices of degree k≥1 to
⌊nPk⌋, adding one for k=1 if needed for
an even sum of degrees). Then, {Gn} is uniformly sparse and
with probability one it
converges locally to T(P,ρ,∞).
The same happens
if Gn is drawn according to the so-called configuration model
(cf. [4]).

We review here the notations and a couple of classical tools
we use throughout this paper. To this end, when
proving our results it is useful to allow for vertex-dependent
magnetic fields Bi, that is, to replace the
basic model (1) by

$$\mu(\underline{x})=\frac{1}{Z(\beta,\underline{B})}\exp\Big\{\beta\sum_{(i,j)\in E}x_ix_j+\sum_{i\in V}B_ix_i\Big\}. \tag{16}$$

Given $U\subseteq V$, we denote by $(+)_U$ [respectively, $(-)_U$]
the vector $\{x_i=+1,\ i\in U\}$ [respectively,
$\{x_i=-1,\ i\in U\}$], dropping the subscript $U$ whenever
clear from the context. Further, we use $\underline{x}_U\preceq\underline{x}'_U$
when two real-valued vectors $\underline{x}$ and $\underline{x}'$ are
such that
$x_i\le x'_i$ for all $i\in U$, and say that
a distribution $\rho_U(\cdot)$ over $\mathbb{R}^U$ is
dominated by a distribution $\rho'_U(\cdot)$ over
this set (denoted $\rho_U\preceq\rho'_U$)
if the two distributions can be coupled so
that $\underline{x}_U\preceq\underline{x}'_U$ for any
pair $(\underline{x}_U,\underline{x}'_U)$ drawn from this coupling.
Finally, we use throughout the shorthand
$\langle\nu,f\rangle=\sum_xf(x)\nu(x)$
for a distribution $\nu$ and function $f$ on the same finite set, or
$\langle f\rangle$ when $\nu$ is clear from the context.

The first classical result we need is Griffiths inequality (see
[19], Theorem IV.1.21).

Theorem 3.1

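Consider two Ising models $\mu(\cdot)$ and $\mu'(\cdot)$
on graphs $G=(V,E)$ and $G'=(V,E')$,
inverse temperatures $\beta$ and $\beta'$, and magnetic fields $\{B_i\}$
and $\{B'_i\}$, respectively.
If $E\subseteq E'$, $\beta\le\beta'$ and $0\le B_i\le B'_i$
for all $i\in V$, then
$0\le\langle\mu,\prod_{i\in U}x_i\rangle\le\langle\mu',\prod_{i\in U}x_i\rangle$
for any $U\subseteq V$.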

The second classical result we use is the GHS inequality (see [12])
about the effect of the magnetic field $\underline{B}$ on the local
magnetizations
at various vertices.

Theorem 3.2 (Griffiths, Hurst, Sherman)

Let $\beta\ge 0$ and for $\underline{B}=\{B_i : i\in V\}$, denote by
$m_j(\underline{B})\equiv\mu(\{\underline{x} : x_j=+1\})-\mu(\{\underline{x} : x_j=-1\})$
the local magnetization at vertex $j$ in
the Ising model (16).
If $B_i\ge 0$ for all $i\in V$, then for any three
vertices $j,k,l\in V$ (not necessarily distinct),

$$\frac{\partial^2m_j(\underline{B})}{\partial B_k\,\partial B_l}\le 0. \tag{17}$$
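The sign condition (17) can be probed numerically on a toy graph by finite differences. This is only an illustrative sanity check, not a proof; the function names, the triangle example and the step size `eps` are assumptions of the sketch.

```python
import itertools
import math

def magnetization(n, edges, beta, fields, j):
    """m_j(B) for the Ising model (16) on vertices 0..n-1, by enumeration."""
    num = Z = 0.0
    for x in itertools.product((+1, -1), repeat=n):
        w = math.exp(beta * sum(x[a] * x[b] for a, b in edges)
                     + sum(f * s for f, s in zip(fields, x)))
        Z += w
        num += x[j] * w
    return num / Z

def second_difference(n, edges, beta, fields, j, k, l, eps=1e-3):
    """Central finite-difference estimate of d^2 m_j / dB_k dB_l."""
    def m(dk, dl):
        f = list(fields)
        f[k] += dk
        f[l] += dl
        return magnetization(n, edges, beta, f, j)
    return (m(eps, eps) - m(eps, -eps) - m(-eps, eps) + m(-eps, -eps)) / (4 * eps * eps)

# Hypothetical example: a triangle with nonnegative, vertex-dependent fields.
triangle = [(0, 1), (1, 2), (0, 2)]
fields = [0.3, 0.5, 0.2]
worst = max(second_difference(3, triangle, 0.6, fields, j, k, l)
            for j in range(3) for k in range(3) for l in range(3))
```

On this example every estimated second derivative should be nonpositive up to discretization error, in line with (17).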

Finally, we need the following elementary inequality:

Lemma 3.3

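For any function $f : \mathcal{X}\mapsto[0,f_{\max}]$
and distributions $\nu$, $\nu'$ on the finite
set $\mathcal{X}$ such that $\nu(f>0)>0$ and $\nu'(f>0)>0$,

$$\sum_x\left|\frac{\nu(x)f(x)}{\langle\nu,f\rangle}-\frac{\nu'(x)f(x)}{\langle\nu',f\rangle}\right|\le\frac{3f_{\max}}{\max(\langle\nu,f\rangle,\langle\nu',f\rangle)}\,\|\nu-\nu'\|_{TV}. \tag{18}$$

In particular, if $0<f_{\min}\le f(x)$, then the
right-hand side is bounded by $(3f_{\max}/f_{\min})\|\nu-\nu'\|_{TV}$.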
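A quick randomized check of (18) is sketched below. It assumes the convention $\|\nu-\nu'\|_{TV}=\frac12\sum_x|\nu(x)-\nu'(x)|$ and takes $f_{\max}=\max_xf(x)$; both choices, as well as the helper names, are assumptions of this illustration.

```python
import random

def tv(nu, nu2):
    """Total variation distance with the 1/2 * L1 convention (assumed here)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(nu, nu2))

def lemma_sides(nu, nu2, f):
    """Return (lhs, rhs) of (18), taking f_max = max(f)."""
    a = sum(p * v for p, v in zip(nu, f))      # <nu, f>
    b = sum(q * v for q, v in zip(nu2, f))     # <nu', f>
    lhs = sum(abs(p * v / a - q * v / b) for p, q, v in zip(nu, nu2, f))
    rhs = 3 * max(f) / max(a, b) * tv(nu, nu2)
    return lhs, rhs

def random_distribution(size, rng):
    """A random probability vector of the given length."""
    w = [rng.random() for _ in range(size)]
    s = sum(w)
    return [x / s for x in w]

rng = random.Random(0)
results = []
for _ in range(200):
    nu, nu2 = random_distribution(5, rng), random_distribution(5, rng)
    f = [rng.random() for _ in range(5)]
    results.append(lemma_sides(nu, nu2, f))
```

Each pair in `results` should satisfy `lhs <= rhs`, typically with considerable slack.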

Proof.

Assuming without loss of generality that $\langle\nu',f\rangle\ge\langle\nu,f\rangle>0$, the left-hand side of (18) can be bounded as

We prove in this section certain facts about Ising models on trees
which are of independent interest and as a byproduct we
deduce Lemma 2.1 and the theorems
of Section 2.3.
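In doing so, recall that
for each $\ell\ge 1$ the Ising models
on $T(\ell)$ with free and plus boundary conditions are

$$\mu^{\ell,0}(\underline{x})\equiv\frac{1}{Z^{\ell,0}}\exp\Big\{\beta\sum_{(ij)\in T(\ell)}x_ix_j+\sum_{i\in T(\ell)}B_ix_i\Big\}, \tag{19}$$

$$\mu^{\ell,+}(\underline{x})\equiv\frac{1}{Z^{\ell,+}}\exp\Big\{\beta\sum_{(ij)\in T(\ell)}x_ix_j+\sum_{i\in T(\ell)}B_ix_i\Big\}\,\mathbb{I}\big(\underline{x}_{\partial T(\ell)}=(+)_{\partial T(\ell)}\big). \tag{20}$$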

Equivalently, $\mu^{\ell,0}$ is the Ising model (16)
on $T(\ell)$
with magnetic fields $\{B_i\}$ and $\mu^{\ell,+}$ is the modified Ising
model corresponding to the limit $B_i\uparrow+\infty$
for all $i\in\partial T(\ell)$. To simplify our notation we
denote such limits hereafter simply by setting $B_i=+\infty$
and use $\mu^{\ell}$ for statements that apply to both free
and plus boundary conditions.

We start with the following simple but useful observation.

Lemma 4.1

For a subtree $U$ of a finite tree $T$ let
$\partial_*U$ denote the subset of
vertices of $U$ connected by an edge
to $W\equiv T\setminus U$, and
for each $u\in\partial_*U$ let
$\langle x_u\rangle_W$ denote the root magnetization of the Ising model
on the maximal subtree $T_u$ of $W\cup\{u\}$
rooted at $u$. The marginal on $U$ of the Ising measure on $T$,
denoted $\mu^T_U$, is then an Ising measure on $U$ with
magnetic field $B'_u=\operatorname{atanh}(\langle x_u\rangle_W)\ge B_u$ for
$u\in\partial_*U$ and $B'_u=B_u$ for $u\notin\partial_*U$.

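Proof.

Since $U$ is a subtree
of the tree $T$, the subtrees
$T_u$ for $u\in\partial_*U$ are disjoint.
Therefore, with $\hat\mu_u(\underline{x})$ denoting
the Ising model distribution for $T_u$
we have that

$$\mu^T_U(\underline{x}_U)=\frac{1}{\hat Z}\,f(\underline{x}_U)\prod_{u\in\partial_*U}\hat\mu_u(x_u) \tag{21}$$

for the Boltzmann weight

$$f(\underline{x}_U)=\exp\Big\{\beta\sum_{(uv)\in U}x_ux_v+\sum_{u\in U\setminus\partial_*U}B_ux_u\Big\}.$$

Further, $x_u\in\{+1,-1\}$ so for each $u\in\partial_*U$
and some constants $c_u$,

$$\hat\mu_u(x_u)=\frac{1}{2}\big(1+x_u\langle x_u\rangle_W\big)=c_u\exp\big(\operatorname{atanh}(\langle x_u\rangle_W)\,x_u\big).$$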

Embedding the normalization constants $c_u$ within $\hat Z$
we thus conclude that $\mu^T_U$ is an Ising measure on $U$
with the stated magnetic field $B'_u$. Finally, comparing
the root magnetization for $T_u$ with that for
$\{u\}$ we have by Griffiths
inequality that $\langle x_u\rangle_W\ge\tanh(B_u)$, as claimed.
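Lemma 4.1 can be checked by hand on the smallest nontrivial case. The sketch below takes $T$ to be the path $0-1-2$ with a uniform field, $U=\{0,1\}$ and $W=\{2\}$, so that $\partial_*U=\{1\}$ and $T_u$ is the single edge $1-2$; the helper `ising` and the parameter values are illustrative assumptions.

```python
import itertools
import math

def ising(n, edges, beta, fields):
    """Ising measure (16) on vertices 0..n-1, as a dict {configuration: probability}."""
    w = {x: math.exp(beta * sum(x[i] * x[j] for i, j in edges)
                     + sum(b * s for b, s in zip(fields, x)))
         for x in itertools.product((+1, -1), repeat=n)}
    Z = sum(w.values())
    return {x: v / Z for x, v in w.items()}

beta, B = 0.7, 0.2

# Marginal on U = {0, 1} of the Ising measure on the path 0 - 1 - 2.
full = ising(3, [(0, 1), (1, 2)], beta, [B, B, B])
marginal_U = {}
for x, p in full.items():
    marginal_U[x[:2]] = marginal_U.get(x[:2], 0.0) + p

# T_u is the edge 1 - 2 rooted at u = 1 (relabelled 0 - 1 here);
# <x_u>_W is its root magnetization.
sub = ising(2, [(0, 1)], beta, [B, B])
m_u = sum(x[0] * p for x, p in sub.items())
B_prime = math.atanh(m_u)

# Ising measure on U with the modified field B'_u at the boundary vertex u = 1.
effective = ising(2, [(0, 1)], beta, [B, B_prime])
```

Here `marginal_U` and `effective` should agree entry by entry, and `B_prime` should be at least `B`, as the lemma states.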

Theorem 4.2

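Suppose $T$ is a conditionally independent infinite tree
of average offspring numbers bounded by $\Delta$,
as in Definition 2.2.
For $0<B_{\min}\le B_{\max}$, $\beta_{\max}$ and $\Delta$ finite,
there exist $M=M(\beta_{\max},B_{\min},\Delta)$ and
$C=C(\beta_{\max},B_{\max})$ finite such that if
$B_i\le B_{\max}$ for all $i\in T(r-1)$ and
$B_i\ge B_{\min}$ for all $i\in T(\ell)$, $\ell>r$, then

$$\mathbb{E}\big\|\mu^{\ell,+}_U-\mu^{\ell,0}_U\big\|_{TV}\le\delta(\ell-r)\,\mathbb{E}\big\{C^{|T(r)|}\big\} \tag{22}$$

for $\delta(t)=M/t$, all $U\subseteq T(r)$ and $\beta\le\beta_{\max}$.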

Proof.

Fixing $\ell>r$,
it suffices to consider $U=T(r)$ [for which
the left-hand side of (22) is maximal].
For this $U$ and $T=T(\ell)$ we have that
$\partial_*U=\partial T(r)$ and
$U\setminus\partial_*U=T(r-1)$, where in
this case the Boltzmann weight $f(\cdot)$ in (21)
is bounded above by $f_{\max}=c^{|T(r)|}$ and
below by $f_{\min}=1/f_{\max}$
for $c=\exp(\beta_{\max}+B_{\max})$. Further, the
plus and free boundary conditions then differ in
(21) by having the corresponding
boundary conditions at generation $\ell-r$ of
each subtree $T_u$, which we
distinguish by using
$\hat\mu^{+/0}_u(x_u)$ instead of
$\hat\mu_u(x_u)$. Since
the total variation distance between two product measures
is at most the sum of the distances between their marginals,
upon applying Lemma 3.3 we deduce
from (21) that

$$\big\|\mu^{\ell,+}_{T(r)}-\mu^{\ell,0}_{T(r)}\big\|_{TV}\le\frac{3}{2}\,c^{2|T(r)|}\sum_{i\in\partial T(r)}\big|\hat\mu^{+}_i(x_i=1)-\hat\mu^{0}_i(x_i=1)\big|.$$

By our assumptions, conditional on $U=T(r)$,
the subtrees $T_i$ of $T=T(\ell)$
(denoted hereafter also by $T_i$) are,
for $i\in\partial T(r)$, independent of each other.
Further, $2\hat\mu^{+/0}_i(x_i=1)-1$
is precisely the magnetization of their root vertex
under plus/free boundary conditions at generation
$\ell-r$. Thus, taking $C=e\,c^2$
(and using the inequality
$y\le e^y$), it suffices to show that the magnetizations
$m^{\ell,+/0}(\underline{B})=\langle\mu^{\ell,+/0},x_{ø}\rangle$
at the root of any such conditionally independent infinite tree
$T$ satisfy
$\mathbb{E}\{m^{\ell,+}(\underline{B})-m^{\ell,0}(\underline{B})\}\le M/\ell$,
for some $M=M(\beta_{\max},B_{\min},\Delta)$ finite,
all $\beta\le\beta_{\max}$ and
$\ell\ge 1$, where we have removed the absolute value since
$m^{\ell,+}(\underline{B})\ge m^{\ell,0}(\underline{B})$ by
Griffiths inequality.
For the reader's convenience, this fact is
proved in the next lemma.

Lemma 4.3

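Suppose $T$ is a conditionally independent infinite tree
of average offspring numbers bounded by $\Delta$.
For $0<B_{\min}\le B_{\max}$, $\beta_{\max}$ and $\Delta$ finite,
there exists $M=M(\beta_{\max},B_{\min},\Delta)$ such that

$$\mathbb{E}\{m^{\ell,+}(\underline{B})-m^{\ell,0}(\underline{B})\}\le\frac{M}{\ell}, \tag{23}$$

where $m^{\ell,+/0}(\underline{B})=\langle\mu^{\ell,+/0},x_{ø}\rangle$
are the root magnetizations under $+$ and free boundary
conditions on $T$.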

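Proof.

Note that (23) trivially holds for $\beta=0$
[in which case $\mu^{\ell,+}(x_{ø})=\mu^{\ell,0}(x_{ø})$].
Assuming hereafter that $\beta>0$ we
proceed to prove (23)
when each vertex of $T(\ell-1)$ has a
nonzero offspring number.
To this end,
for $\underline{H}=\{H_i\in\mathbb{R} : i\in\partial T(k)\}$ let

$$\mu^{k,\underline{H}}(\underline{x})\equiv\frac{1}{Z^{k,\underline{H}}}\exp\Big\{\beta\sum_{(ij)\in T(k)}x_ix_j+\sum_{i\in T(k)}B_ix_i+\sum_{i\in\partial T(k)}H_ix_i\Big\}$$

and denote by $m^k(\underline{B},\underline{H})$ the corresponding
root magnetization.
Writing $H$ instead of $\underline{H}$ for a constant magnetic field
on the leaf nodes, that is, when $H_i=H$
for each $i\in\partial T(k)$, we note that
$m^{k,+}(\underline{B})=m^k(\underline{B},\infty)$
and $m^{k,0}(\underline{B})=m^k(\underline{B},0)$.
Further, applying Lemma 4.1 for
the subtree $T(k-1)$ of $T(k)$
we represent $m^k(\underline{B},\infty)$ as the root magnetization
$m^{k-1}(\underline{B}',0)$ on $T(k-1)$, where
$B'_i=B_i+\beta\Delta_i$ for $i\in\partial T(k-1)$ and
$B'_i=B_i$ for all other $i$. Consequently,

$$m^k(\underline{B},\infty)=m^{k-1}(\underline{B},\{\beta\Delta_i\}). \tag{24}$$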

(24)

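Recall that if
$\partial^2g/\partial z_i^2\le 0$ for $i=1,\dots,s$,
then applying Jensen’s inequality one variable at a time we
have that
$\mathbb{E}\,g(Z_1,\dots,Z_s)\le g(\mathbb{E}Z_1,\dots,\mathbb{E}Z_s)$
for any independent random variables $Z_1,\dots,Z_s$.
By the GHS inequality, this is the case for
$\underline{H}\mapsto m^{k-1}(\underline{B},\underline{H})$, hence
with $\mathbb{E}_k$ denoting
the conditional on $T(k)$
expectation over the independent offspring numbers
$\Delta_i$ for $i\in\partial T(k)$, we deduce that

$$\mathbb{E}_{k-1}\,m^k(\underline{B},\infty)\le m^{k-1}(\underline{B},\{\beta\,\mathbb{E}_{k-1}\Delta_i\})\le m^{k-1}(\underline{B},\beta\Delta), \tag{25}$$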

(25)

where the last inequality is a consequence of
Griffiths inequality and our assumption that
$\mathbb{E}_t\Delta_i\le\Delta$ for any $i\in\partial T(t)$
and all $t\ge 0$.
Since each $i\in\partial T(k-1)$ has at least
one offspring whose magnetic field is at least $B_{\min}$,
it follows by Griffiths inequality that
$m^{k,0}(\underline{B})$ is bounded below by the
magnetization at the root of the subtree $T$ of
$T(k)$ where $\Delta_i=1$ for all
$i\in\partial T(k-1)$
and $B_i=B_{\min}$ for all $i\in\partial T(k)$.
Applying Lemma 4.1
for $T$ and $U=T(k-1)$, the root magnetization for
the Ising distribution on $T$ turns out to be
precisely $m^{k-1}(\underline{B},\xi)$ for $\xi=\xi(\beta,B_{\min})>0$
of (9).
Thus, one more application of Griffiths inequality yields that

$$m^k(\underline{B},0)\ge m^{k-1}(\underline{B},\xi)\ge m^{k-1}(\underline{B},0). \tag{26}$$

(26)

Next note that $\xi(\beta,B)\le\beta\le\beta\Delta$ and by the
GHS inequality $H\mapsto m^{k-1}(\underline{B},H)$ is concave. Hence,

We have seen in (26) that
$k\mapsto m^{k,0}(\underline{B})$ is nondecreasing, whereas
from (24) and Griffiths inequality we have
that $k\mapsto m^{k,+}(\underline{B})$ is nonincreasing.
With magnetization bounded above by one, we thus get
upon summing the preceding inequalities for $k=1,\dots,\ell$ that

Considering now the general case where the
infinite tree $T$ has vertices (other than the root)
of degree one, let $T_*(\ell)$ denote the “backbone”
of $T(\ell)$, that is, the subtree
induced by vertices along self-avoiding paths between ø and
$\partial T(\ell)$. Taking $U=T_*(\ell)$ as the subtree
of $T=T(\ell)$ in Lemma 4.1,
note that for each $u\in\partial_*U$ the
subtree $T_u$ contains no vertex from $\partial T(\ell)$.
Consequently, the marginal measures $\mu^{\ell,+/0}_U$ are Ising
measures on $U$ with the same magnetic fields
$B'_i\ge B_i\ge B_{\min}$ outside $\partial T(\ell)$.
Thus, with $m^{\ell,+/0}_*(\underline{B})$
denoting the corresponding magnetizations at the root
for $T_*(\ell)$, we deduce that
$m^{\ell,+/0}(\underline{B})=m^{\ell,+/0}_*(\underline{B}')$ where
$B'_i\ge B_i\ge B_{\min}$ for all $i$. By definition every
vertex of $T_*(\ell-1)$ has a nonzero
offspring number and with $B'_i\ge B_{\min}$, the required bound

$$\mathbb{E}\{m^{\ell,+}(\underline{B})-m^{\ell,0}(\underline{B})\}=\mathbb{E}\{m^{\ell,+}_*(\underline{B}')-m^{\ell,0}_*(\underline{B}')\}\le\frac{M}{\ell}$$

follows by the preceding argument, since
$T_*(\ell)$ is a conditionally independent tree
whose offspring numbers $\Delta^*_i\ge 1$ do not exceed those of
$T(\ell)$. Indeed, for $k=0,1,\dots,\ell-1$,
given $T_*(k)$
the offspring numbers at $i\in\partial T_*(k)$
are independent of each other [with the probability of
$\{\Delta^*_i=s\}$ proportional to the sum over $t\ge 0$
of the product of the probability
of $\{\Delta_i=s+t\}$ and that of precisely $s$ out of the $s+t$
offspring of $i$ in $T(\ell)$ having a line of descendants
that survives an additional $\ell-k-1$ generations, for $s\ge 1$].

Simon’s inequality (see [25], Theorem 2.1)
allows one to bound the (centered) two point
correlation functions in ferromagnetic
Ising models with zero magnetic field.
We provide next its generalization to arbitrary magnetic
field, in the case of Ising models on trees.

Lemma 4.4

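If edge $(i,j)$ is on the
unique path from ø to $k\in T(\ell)$, with
$j$ a descendant of $i\in\partial T(t)$, $t\ge 0$, then

$$\langle x_{ø};x_k\rangle^{(\ell)}_{ø}\le\cosh^2(2\beta+B_i)\,\langle x_{ø};x_i\rangle^{(t)}_{ø}\,\langle x_j;x_k\rangle^{(\ell)}_j, \tag{28}$$

where $\langle\cdot\rangle^{(r)}_i$ denotes the expectation
with respect to
the Ising distribution $\hat\mu_i(\cdot)$
on the subtree $T_i$ of $i$ and all its descendants in
$T(r)$, and $\langle x;y\rangle\equiv\langle xy\rangle-\langle x\rangle\langle y\rangle$ denotes the
centered two point correlation function.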

Proof.

It is not hard to
check that if $x,y,z$ are $\{+1,-1\}$-valued random variables
with $x$ and $z$ conditionally independent given $y$, then

$$\langle x;z\rangle=\frac{\langle x;y\rangle\langle y;z\rangle}{1-\langle y\rangle^2}. \tag{29}$$
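Identity (29) follows because, for a $\pm1$-valued $y$, the conditional means of $x$ and $z$ are affine functions of $y$. A randomized numerical check is sketched below; the parametrization by conditional probabilities and the function name `check_identity` are assumptions of this illustration.

```python
import random

def check_identity(rng):
    """Return both sides of (29) for a random chain x - y - z of +/-1 variables."""
    py = rng.uniform(0.1, 0.9)                     # P(y = +1), kept away from 0 and 1
    px = {+1: rng.random(), -1: rng.random()}      # P(x = +1 | y)
    pz = {+1: rng.random(), -1: rng.random()}      # P(z = +1 | y)
    ex = ey = ez = exy = eyz = exz = 0.0
    for y, wy in ((+1, py), (-1, 1 - py)):
        mx = 2 * px[y] - 1                         # E[x | y]
        mz = 2 * pz[y] - 1                         # E[z | y]
        ex += wy * mx
        ey += wy * y
        ez += wy * mz
        exy += wy * mx * y
        eyz += wy * y * mz
        exz += wy * mx * mz                        # uses conditional independence
    lhs = exz - ex * ez
    rhs = (exy - ex * ey) * (eyz - ey * ez) / (1 - ey ** 2)
    return lhs, rhs

rng = random.Random(1)
pairs = [check_identity(rng) for _ in range(100)]
```

Both entries of each pair should agree up to floating-point rounding.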

In particular, under $\mu^{\ell,0}$ the random variables
$x_{ø}$ and $x_k$ are conditionally independent given
$y=x_i$ with