In a dynamic social or biological environment, interactions between the underlying actors can undergo large and systematic changes. Each actor can assume multiple roles, and their degrees of affiliation to these roles can also exhibit rich temporal phenomena. We propose a state space mixed membership stochastic blockmodel which can track the evolving roles of the actors across time. We also derive an efficient variational inference procedure for our model, and apply it to the Enron email networks and to rewiring gene regulatory networks of yeast. In both cases, our model reveals interesting dynamical roles of the actors.

...ion and re-organization of complex relational networks. Recently there has been a surge of interest in applying latent space models for network modeling and analysis (Hoff et al., 2002; Li & McCallum, 2006; Handcock et al., 2007; Erosheva et al., 2004; Airoldi et al., 2008). However, an important aspect has not been addressed so far: none of these models considered the dynamic nature of the networks. Rather than having a sin...

by
Eric P. Xing, Wenjie Fu, Le Song
- SUBMITTED TO THE ANNALS OF APPLIED STATISTICS

In a dynamic social or biological environment, the interactions between the actors can undergo large and systematic changes. In this paper, we propose a model-based approach to analyze what we will refer to as the dynamic tomography of such time-evolving networks. Our approach offers an intuitive but powerful tool to infer the semantic underpinnings of each actor, such as its social roles or biological functions, underlying the observed network topologies. Our model builds on earlier work on a mixed membership stochastic blockmodel for static networks, and the state-space model for tracking object trajectory. It overcomes a major limitation of many current network inference techniques, which assume that each actor plays a unique and invariant role that accounts for all its interactions with other actors; instead, our method models the role of each actor as a time-evolving mixed membership vector that allows actors to behave differently over time and carry out different roles/functions when interacting with different peers, which is closer to reality. We present an efficient algorithm for approximate inference and learning using our model, and we apply our model to analyze a social network between monks (i.e., Sampson’s network), a dynamic email communication network between the Enron employees, and a rewiring gene interaction network of fruit fly collected during its full life cycle. In all cases, our model reveals interesting patterns of the dynamic roles of the actors.

Consider the problem of estimating the entries of a large matrix, when the observed entries are noisy versions of a small random fraction of the original entries. This problem has received widespread attention in recent times, especially after the pioneering works of Emmanuel Candès and collaborators. This paper introduces a simple estimation procedure, called Universal Singular Value Thresholding (USVT), that works for any matrix that has ‘a little bit of structure’. Surprisingly, this simple estimator achieves the minimax error rate up to a constant factor. The method is applied to solve problems related to low rank matrix estimation, blockmodels, distance matrix completion, latent space models, positive definite matrix completion, graphon estimation, and generalized Bradley–Terry models for pairwise comparison.
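The abstract names singular value thresholding but does not spell out the steps, so the following is only a minimal sketch of the general idea, not the paper's exact procedure. It assumes entries bounded in [-1, 1], entries observed independently at random, and a threshold of the form (2 + eta) * sqrt(n * p_hat); the function name `usvt`, the parameter `eta`, and the choice of constants are assumptions made for illustration.

```python
import numpy as np

def usvt(observed, mask, eta=0.01):
    """Sketch of singular value thresholding for a partially observed matrix.

    observed : 2-D array of (possibly noisy) values, assumed to lie in [-1, 1]
    mask     : boolean array of the same shape, True where an entry was observed
    eta      : small positive slack in the threshold (an assumed default)
    """
    m, n = observed.shape
    p_hat = mask.mean()                       # estimated fraction of observed entries
    if p_hat == 0:
        return np.zeros((m, n))               # nothing observed, nothing to estimate
    filled = np.where(mask, observed, 0.0)    # unobserved entries are set to zero
    u, s, vt = np.linalg.svd(filled, full_matrices=False)
    threshold = (2.0 + eta) * np.sqrt(max(m, n) * p_hat)
    keep = s >= threshold                     # keep only the large singular values
    low_rank = (u[:, keep] * s[keep]) @ vt[keep, :]
    # Rescale by the observation rate and clip back to the assumed range.
    return np.clip(low_rank / p_hat, -1.0, 1.0)

# Hypothetical usage: a constant matrix with about half of its entries observed.
rng = np.random.default_rng(0)
truth = np.full((200, 200), 0.5)
mask = rng.random(truth.shape) < 0.5
estimate = usvt(truth, mask)
```

Zero-filling followed by division by `p_hat` makes the filled matrix an unbiased proxy for the full one under uniform random sampling, which is why a single truncated SVD can serve as the estimator in this sketch.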

...[5, 57]. Note that distance matrices and stochastic blockmodels are both special cases of latent space models. There have been various attempts to estimate parameters in the latent space models (e.g. [55, 53, 4]). Almost all of these approaches rely on heuristic arguments and justification through simulations. The problem is that in addition to the vectors β1, ..., βn, the function f itself is an unknown ...

Bayesian inference for exponential random graph models. Exponential random graph models are extremely difficult models to handle from a statistical viewpoint, since their normalising constant, which depends on model parameters, is available only in very trivial cases. We show how inference can be carried out in a Bayesian framework using an MCMC algorithm, which circumvents the need to calculate the normalising constants. We use a population MCMC approach which accelerates convergence and improves mixing of the Markov chain. This approach improves performance with respect to the Monte Carlo maximum likelihood method of Geyer and Thompson (1992).

With the rapid development of all kinds of online databases, the huge heterogeneous information networks derived from them have become ubiquitous. Detecting evolutionary communities in these networks can help people better understand the structural evolution of the networks. However, most current community evolution analysis is based on homogeneous networks, while a real community usually involves different types of objects in a heterogeneous network. For example, a research community contains a set of authors, a set of conferences or journals, and a set of terms. In this paper, we study the problem of detecting evolutionary multi-typed communities, defined as net-clusters, in dynamic heterogeneous networks. A generative model based on the Dirichlet Process Mixture Model is proposed to model the generation of communities. At each time stamp, a clustering of communities, with the cluster number that best explains the current and historical networks, is automatically detected. A Gibbs sampling-based algorithm is provided for inference in the model. The evolution structure can also be read from the model, which can help users better understand the birth, split, and death of communities. Experiments on two real datasets, DBLP and Delicious.com, show the effectiveness of the algorithm.

...s. The study of community detection problem is first on homogeneous networks, such as spectral clustering methods [16, 22, 23], modularity-based methods [13, 12], and probabilistic model-based methods [17, 8, 7, 1], and later to bipartite networks [26, 5], and recently on heterogeneous networks [18, 19]. In this paper, we will consider the heterogeneous networks with star network schema as in [19], which is a v...

Proportional representation by means of a single transferable vote (PR-STV) is the electoral system employed in Irish elections. In this system, voters rank some or all of the candidates in order of preference. A latent space model is proposed for these election data where both candidates and voters are located in the same D-dimensional space. The locations are determined by the ranked preferences which are modeled using the Plackett-Luce model for rank data. Voter positions reflect their preferences while the candidate locations represent the global view of the candidates by the electorate.
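To make the Plackett-Luce component concrete, here is a minimal sketch of how the probability of a (possibly partial) ranking is computed from per-candidate support parameters: each preference is drawn in turn from the candidates not yet chosen, with probability proportional to its support. The link from latent positions to support used in `support_from_positions` (exponential of negative squared Euclidean distance) is an assumption for illustration, since the abstract does not state the exact form; both function names are hypothetical.

```python
import numpy as np

def plackett_luce_log_prob(ranking, support):
    """Log-probability of a ranking under the Plackett-Luce model.

    ranking : sequence of candidate indices, most preferred first (may be partial)
    support : positive support parameter for every candidate
    """
    support = np.asarray(support, dtype=float)
    remaining = np.ones(len(support), dtype=bool)   # candidates still available
    log_prob = 0.0
    for candidate in ranking:
        # Chosen candidate's support, normalised over the remaining candidates.
        log_prob += np.log(support[candidate]) - np.log(support[remaining].sum())
        remaining[candidate] = False
    return log_prob

def support_from_positions(voter, candidates):
    """Assumed link: support decays with squared distance in the latent space."""
    sq_dist = ((np.asarray(candidates, dtype=float)
                - np.asarray(voter, dtype=float)) ** 2).sum(axis=1)
    weights = np.exp(-sq_dist)
    return weights / weights.sum()

# Hypothetical usage: one voter and three candidates in a 2-dimensional space.
voter = [0.0, 0.0]
candidates = [[0.1, 0.0], [1.0, 1.0], [2.0, -1.0]]
support = support_from_positions(voter, candidates)
log_p = plackett_luce_log_prob([0, 1, 2], support)
```

Because the ranking may stop early, the same function handles voters who rank only some of the candidates, as PR-STV ballots allow.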

...used for the candidate and voter locations, yet a more structured prior on the voters could be employed — for example, a mixture of normals as was used in a social networks context by Handcock et al. [18] may provide a more suitable prior. Acknowledgments Both authors would like to thank Adrian Raftery and other members of the Working Group on Model-Based Clustering at the University of Washington, Se...

Many algorithms have been proposed for fitting network models with communities but most of them do not scale well to large networks, and often fail on sparse networks. Here we propose a new fast pseudo-likelihood method for fitting the stochastic block model for networks, as well as a variant that allows for an arbitrary degree distribution by conditioning on degrees. We show that the algorithms perform well under a range of settings, including on very sparse networks, and illustrate on the example of a network of political blogs. We also propose spectral clustering with perturbations, a method of independent interest, which works well on sparse networks where regular spectral clustering fails, and use it to provide an initial value for pseudo-likelihood. We prove that pseudo-likelihood provides consistent estimates of the communities under a mild condition on the starting value, for the case of a block model with two balanced communities.

... include the popular stochastic block model [20], its extensions to include varying degree distributions within communities [22] and overlapping communities [2, 4], and various latent variable models [17, 19]. The stochastic block model is perhaps the most commonly used and best studied model for community detection. For a network with n nodes defined by its n × n adjacency matrix A, this model postulates ...