Abstract

The analysis of large and complex networks, or graphs, becomes increasingly important in many scientific areas such as Machine Learning, Social Network Analysis and Bioinformatics. One important type of question is a??Given two sets R and T of individuals in a graph with complete and missing knowledge, respectively, about a pre-defined property, which individuals in T are closest to R with respect to this property?a??. To answer this question, we can rank the individuals in T such that the individuals ranked highest are most likely to exhibit the property of interest. Several methods based on
weighted paths in the graph and Markov chain models have been proposed to solve this task. In this paper, we show that we can improve previously published approaches by rephrasing this problem as the task of property prediction in graph-structured data from positive
examples, the individuals in R, and unlabelled data, the individuals in T, and applying an inexpensive iterative neighbourhooda??s majority vote based prediction algorithm (a??iNMVa??) to this task. We evaluate our iNMV prediction algorithm and two previously proposed methods using Markov chains to rank individuals in a graph with respect to a pre-defined property of interest and a set R on three real world graphs in terms of ROC AUC statistic. iNMV obtains rankings that are either significantly better or not significantly worse than
the rankings obtained from the more complex Markov chain based algorithms, and at the same time achieves a reduction in run time complexity of one order of magnitude on large graphs.