Codes and Data

Codes

We are surrounded by social and information-sharing networks, over which diffusions of information, events, and viruses take place constantly. We often observe that after some influential users adopt a new product or idea, they actively influence the behavior of their friends, which in turn leads friends of friends to adopt the product through word of mouth.
The specific questions we seek to address in this NIPS 2013 paper are: how to accurately estimate the number of follow-ups that can be triggered by a given set of early influential users, and how to identify a set of influential users, to whom we will give promotions, so as to trigger the largest expected number of follow-ups as soon as possible. These questions are interesting because, for instance, advertisers want an efficient and effective campaign for their new products.
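As a hedged illustration of the two questions (not the paper's continuous-time estimator), one can estimate expected follow-ups by Monte-Carlo simulation of an independent-cascade diffusion and pick seed users greedily by marginal gain; the toy graph, activation probability, and run counts below are invented for this sketch:

```python
import random

def simulate(graph, seeds, p, rng):
    """One independent-cascade run; returns the number of activated nodes."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def expected_spread(graph, seeds, p=0.3, runs=200):
    """Monte-Carlo estimate of the expected number of follow-ups."""
    rng = random.Random(0)
    return sum(simulate(graph, seeds, p, rng) for _ in range(runs)) / runs

def greedy_seeds(graph, k):
    """Greedily add the node with the largest marginal gain in spread."""
    nodes = set(graph) | {v for vs in graph.values() for v in vs}
    seeds = []
    for _ in range(k):
        best = max(nodes - set(seeds),
                   key=lambda u: expected_spread(graph, seeds + [u]))
        seeds.append(best)
    return seeds

# toy directed graph: node 0 can reach the most nodes
toy = {0: [1, 2], 1: [3], 2: [3]}
print(greedy_seeds(toy, 1))  # → [0]
```

Greedy selection is a standard baseline here because expected spread is monotone submodular, which gives the greedy choice a (1 - 1/e) approximation guarantee.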

Hidden Markov Models (HMMs) are important tools for modeling sequence data. However, they are restricted to discrete latent states, and largely to Gaussian and discrete observations. Moreover, learning algorithms for HMMs have predominantly relied on local search heuristics, with the exception of spectral methods such as those described below. We propose a nonparametric HMM that extends traditional HMMs to structured and non-Gaussian continuous distributions, and we derive a local-minimum-free kernel spectral algorithm for learning these HMMs. We apply our method to robot vision data, slot car inertial sensor data, and audio event classification data, and show that in these applications, embedded HMMs exceed the previous state-of-the-art performance.

We introduce a kernel reweighted logistic regression method (KELLER) for reverse-engineering the dynamic interactions between genes based on their time series of expression values. We apply the proposed method to estimate the latent sequence of temporal rewiring networks of 588 genes involved in the developmental process during the life cycle of Drosophila melanogaster. Our results offer the first glimpse into the temporal evolution of gene networks in a living organism during its full developmental course, and also show that many genes exhibit distinctive functions at different stages of the developmental cycle.

Elefant (Efficient Learning, Large-scale Inference, and Optimization Toolkit) is an open-source Python library for machine learning, licensed under the Mozilla Public License. The aim is to develop an open-source machine learning platform that becomes the platform of choice for prototyping and deploying machine learning algorithms. This toolkit is the common platform for software development in the machine learning team at NICTA. Not all the tools have been released yet, but many can be found in the developers' version with SVN access.

Feature selectors for unconventional data (such as strings and graph labels). A versatile framework for filtering features that employs the Hilbert-Schmidt Independence Criterion (HSIC) as a measure of dependence between the features and the labels. The key idea is that good features should maximize this dependence. Feature selection for various supervised learning problems (including classification and regression) is unified under this framework, and the solutions can be approximated using a backward-elimination algorithm. Written in Python.
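As a rough sketch of the idea (not the released package's API; linear kernels and toy data are assumed for the example), the biased empirical HSIC estimate and the backward-elimination loop might look like:

```python
import numpy as np

def hsic(K, L):
    """Biased empirical HSIC: tr(K H L H) / (n-1)^2, with H = I - 11^T/n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def backward_elimination(X, y, n_keep):
    """Repeatedly drop the feature whose removal hurts dependence least."""
    feats = list(range(X.shape[1]))
    L = np.outer(y, y).astype(float)       # linear kernel on the labels
    while len(feats) > n_keep:
        scores = []
        for f in feats:
            rest = [g for g in feats if g != f]
            K = X[:, rest] @ X[:, rest].T  # linear kernel on remaining features
            scores.append(hsic(K, L))
        # discard the feature whose removal leaves the highest dependence
        feats.pop(int(np.argmax(scores)))
    return feats

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
y = np.sign(X[:, 0])                      # only feature 0 carries signal
print(backward_elimination(X, y, 1))      # → [0]
```

Good features maximize the dependence between the kernel on the inputs and the kernel on the labels, so the informative feature survives elimination.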

Clustering with a metric on labels. A family of clustering algorithms based on the maximization of dependence between the input variables and their cluster labels, as expressed by the Hilbert-Schmidt Independence Criterion (HSIC). Under this framework, we unify the geometric, spectral, and statistical dependence views of clustering, and subsume many existing algorithms as special cases (e.g. k-means and spectral clustering). Distinctive to our framework is that kernels can also be applied to the labels, which endows them with a particular structure. Written in C, with examples in Matlab.
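A minimal illustration of the dependence view, assuming a linear kernel on the inputs and an equality kernel on the labels (both chosen only for this example): a labeling that matches the cluster structure scores higher HSIC than a shuffled one.

```python
import numpy as np

rng = np.random.default_rng(0)
# two well-separated blobs in 2-D
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
               rng.normal(5.0, 0.3, (20, 2))])
n = X.shape[0]
H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
K = X @ X.T                                # linear kernel on inputs

def dependence(labels):
    """Biased HSIC estimate tr(K H L H) / (n-1)^2 with an equality
    kernel on labels: 1 if two points share a label, else 0."""
    L = (labels[:, None] == labels[None, :]).astype(float)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

true_labels = np.repeat([0, 1], 20)
shuffled = rng.permutation(true_labels)
print(dependence(true_labels) > dependence(shuffled))   # → True
```

Maximizing this score over label assignments recovers the blob partition; with a linear input kernel this objective coincides with the k-means objective up to constants.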

Dimensionality reduction with side information. Maximum variance unfolding (MVU) is an effective heuristic for dimensionality reduction. It produces a low-dimensional representation of the data by maximizing the variance of their embeddings while preserving the local distances of the original data. We show that MVU also optimizes a statistical dependence measure which aims to retain the identity of individual observations under the distance preserving constraints. This general view allows us to design “colored” variants of MVU, which produce low-dimensional representations for a given task, e.g. subject to class labels or other side information. This method is also called maximum unfolding via Hilbert-Schmidt Independence Criterion (MUHSIC) or maximum covariance unfolding (MCU). Written in a mix of Matlab and C.

Others

Data

Feature Selection

For the data file (data.txt), each row is a data point and each column is a feature. For the label file (y.txt), there are two cases. For binary data, y has a single column, with 1 for the positive class and 0 for the negative class. For multiclass data, the number of columns in y equals the number of classes: if data point i is in class j, then row i has a 1 in column j and 0 elsewhere.
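Assuming whitespace-separated values (the exact delimiter is not specified above), loading both label layouts might look like this sketch, which first writes a tiny multiclass example:

```python
import numpy as np

# write a toy multiclass example: 3 data points, 2 features, 3 classes
np.savetxt("data.txt", [[0.1, 0.2], [1.0, 0.9], [0.5, 0.4]])
np.savetxt("y.txt", [[1, 0, 0], [0, 1, 0], [0, 0, 1]])

X = np.loadtxt("data.txt")      # rows = data points, columns = features
Y = np.loadtxt("y.txt")
if Y.ndim == 1:                 # binary case: a single column of 0/1
    labels = Y.astype(int)
else:                           # multiclass: one-hot rows, class = column of the 1
    labels = Y.argmax(axis=1)
print(labels)                   # → [0 1 2]
```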