Abstract

We consider the problem of inferring the interactions between a set of N
binary variables from the knowledge of their frequencies and pairwise
correlations. The inference framework is based on the Hopfield model, a special
case of the Ising model where the interaction matrix is defined through a set
of patterns in the variable space, and is of rank much smaller than N. We show
that Maximum Lik elihood inference is deeply related to Principal Component
Analysis when the amp litude of the pattern components, xi, is negligible
compared to N^1/2. Using techniques from statistical mechanics, we calculate
the corrections to the patterns to the first order in xi/N^1/2. We stress that
it is important to generalize the Hopfield model and include both attractive
and repulsive patterns, to correctly infer networks with sparse and strong
interactions. We present a simple geometrical criterion to decide how many
attractive and repulsive patterns should be considered as a function of the
sampling noise. We moreover discuss how many sampled configurations are
required for a good inference, as a function of the system size, N and of the
amplitude, xi. The inference approach is illustrated on synthetic and
biological data.Comment: Physical Review E: Statistical, Nonlinear, and Soft Matter Physics
(2011) to appea