My research interests span a range of topics in statistical machine learning
and its applications, with a particular emphasis on
multi-view and semi-supervised learning algorithms. The common thread
throughout my work is the development and analysis of novel machine learning techniques tailored to
specific real-world problems in systems biology, automated reasoning, and information retrieval.

Co-regularized sparse-group lasso.
We introduce the co-regularized sparse-group lasso algorithm: a technique that incorporates
auxiliary information into the learning task in the form of groups of predictors and the relationships between
those groups. The proposed cost function requires related groups of predictors to provide similar contributions
to the final response and thus guides the feature selection process using auxiliary information.
Our algorithm is particularly suitable for biological applications where good predictive
performance is required and where it is also important to retrieve all relevant predictors
so as to deepen the understanding of the underlying biological process.
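
As an illustration only, the kind of cost function described above can be sketched as follows. The exact form used in our work is not reproduced here: the squared loss, the three penalty weights (lam1, lam2, lam3), and the function name co_sgl_objective are assumptions made for this sketch. It combines a lasso term, a group-lasso term, and a co-regularization term that penalizes differences between the per-sample contributions of related groups.

```python
import math

def co_sgl_objective(X, y, w, groups, related, lam1, lam2, lam3):
    """Hypothetical co-regularized sparse-group lasso objective (illustrative).

    X: list of rows, y: list of responses, w: list of coefficients,
    groups: list of lists of column indices,
    related: list of (g, h) pairs of group indices assumed to be related.
    """
    n = len(y)

    def contribution(g):
        # Per-sample contribution of group g to the response: X_g w_g
        return [sum(row[j] * w[j] for j in groups[g]) for row in X]

    # Squared-error data fit (an assumption; other losses are possible)
    preds = [sum(xj * wj for xj, wj in zip(row, w)) for row in X]
    loss = sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / (2 * n)

    # Sparsity within groups (lasso) and across groups (group lasso)
    lasso = sum(abs(wj) for wj in w)
    group_lasso = sum(
        math.sqrt(len(g)) * math.sqrt(sum(w[j] ** 2 for j in g)) for g in groups
    )

    # Co-regularization: related groups should contribute similarly
    co_reg = 0.0
    for g, h in related:
        cg, ch = contribution(g), contribution(h)
        co_reg += sum((a - b) ** 2 for a, b in zip(cg, ch)) / n

    return loss + lam1 * lasso + lam2 * group_lasso + lam3 * co_reg
```

Setting lam3 to zero recovers an ordinary sparse-group lasso objective, so the last term is the only place where the relationships between groups enter.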

Unsupervised multi-view feature selection via co-regularization.
Existing unsupervised feature selection algorithms are designed to
extract the most relevant subset of features that can facilitate clustering and
interpretation of the obtained results. However, these techniques are not
applicable in many real-world scenarios where one has access to datasets consisting of multiple
views/representations (e.g. various omics profiles, medical text records coupled with fMRI images, etc.).
The proposed method can leverage information from these different
views and produce more robust and accurate results than traditional methods.

KeCo: kernel-based online co-agreement algorithm.
This online algorithm uses a co-agreement strategy
to take into account unlabelled data and to improve classification performance.
Unlike standard online methods, it is naturally applicable to many real-world situations where
data is available in multiple representations. In addition, our online algorithm allows learning non-linear
relations in the data via kernel functions.

Personalized microbial network inference via co-regularized spectral clustering.
Based on the results of co-regularized spectral clustering, this code visualizes two groups of individuals whose
microbial interaction networks differ in topology. The results of microbial network inference
suggest that niche-wise interactions are different in these two groups. The network visualization is implemented in Python and in Matlab.

Online co-regularized algorithm.
The proposed algorithm is particularly applicable to learning tasks where large amounts of (unlabeled) data are available for training.
The algorithm co-regularizes prediction functions on unlabeled data points and leads to improved performance in comparison to
several baseline methods on UCI benchmarks and real-world natural language processing data.
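
To give a flavour of co-regularization on unlabeled points, here is a minimal sketch of a single online update. It is not the published algorithm: it assumes two linear prediction functions (for example, one per view), a squared loss on labeled points, and a squared disagreement penalty weighted by mu on unlabeled points. The names online_coreg_step, lr, and mu are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def online_coreg_step(w1, w2, x1, x2, y=None, lr=0.1, mu=1.0):
    """One stochastic gradient step for two linear predictors (illustrative).

    x1, x2 are the two representations of the same point;
    y is None when the point is unlabeled.
    """
    f1, f2 = dot(w1, x1), dot(w2, x2)
    if y is None:
        # Unlabeled point: gradient of (mu / 2) * (f1 - f2)^2,
        # which pulls the two predictions towards each other
        g1, g2 = mu * (f1 - f2), mu * (f2 - f1)
    else:
        # Labeled point: gradient of the squared loss for each predictor
        g1, g2 = f1 - y, f2 - y
    new_w1 = [w - lr * g1 * x for w, x in zip(w1, x1)]
    new_w2 = [w - lr * g2 * x for w, x in zip(w2, x2)]
    return new_w1, new_w2
```

Because unlabeled points need no target, every incoming point can contribute an update, which is what makes this style of learning attractive when unlabeled data is plentiful.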

Probabilistic preference learner/ranker - ProbRank.
The algorithm learns a ranking function from pairwise comparison data, that is, information about the ranking function values is provided only as pairwise comparisons at the given locations. This is accomplished in two ways:
a) Approximating the marginal likelihood using expectation propagation and maximizing it with respect to the hyperparameters. In this case the squared exponential covariance function is used.
b) Treating ranking as regression with Gaussian noise and a Gaussian process prior, given the score differences.
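
The two ingredients mentioned above can be sketched in a few lines. The squared exponential covariance is standard; the probit form of the pairwise likelihood is an assumption made for this sketch (it is a common choice in Gaussian process preference learning), and the function and parameter names are hypothetical rather than those of the ProbRank code.

```python
import math

def squared_exponential(x, z, signal_var=1.0, length_scale=1.0):
    """Squared exponential covariance between two input vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return signal_var * math.exp(-0.5 * sq_dist / length_scale ** 2)

def preference_probability(f_a, f_b, noise_std=1.0):
    """Probability that item a is preferred over item b (assumed probit model).

    f_a and f_b are latent ranking function values; each is assumed to be
    observed with independent Gaussian noise of standard deviation noise_std.
    """
    z = (f_a - f_b) / (math.sqrt(2.0) * noise_std)
    # Standard normal CDF written with the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Under this model, equal latent scores give a preference probability of one half, and the probability moves towards one as the score difference grows relative to the noise level.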

Learning2Reason. Funded by NWO (Ongoing).
The general aim of this project is to develop machine learning algorithms suitable for mining
large corpora of formally expressed knowledge that are available in the fields of formal mathematics and software verification.