In computational biology, it is common to represent domain knowledge using graphs. Frequently there exist multiple graphs for the same set of nodes, representing information from different sources, and no single graph is sufficient to predict class labels of unlabelled nodes reliably. One way to enhance reliability is to integrate multiple graphs, since individual graphs are partly independent and partly complementary to each other for prediction. In this chapter, we describe an algorithm to assign weights to multiple graphs within graph-based semi-supervised learning. Both predicting class labels and searching for weights for combining multiple graphs are formulated into one convex optimization problem. The graph-combining method is applied to functional class prediction of yeast proteins. When compared with individual graphs, the combined graph with optimized weights performs significantly better than any single graph. When compared with the semidefinite programming-based support vector machine (SDP/SVM), it shows comparable accuracy in a remarkably short time. Compared with a combined graph with equal-valued weights, our method could select important graphs without loss of accuracy, which implies the desirable property of integration with selectivity.
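The combination idea above can be sketched as follows. This is an illustrative label-propagation variant — a convex combination of normalized graph Laplacians with a quadratic label-fitting term and fixed weights — not the chapter's exact convex program (which optimizes the weights jointly); the function names and toy graphs are hypothetical:

```python
import numpy as np

# Illustrative sketch of combining multiple graphs for semi-supervised
# label propagation.  The weights `beta`, the quadratic fitting term, and
# the toy graphs below are assumptions for illustration only.

def combined_laplacian(W_list, beta):
    """Convex combination of normalized graph Laplacians."""
    n = W_list[0].shape[0]
    L = np.zeros((n, n))
    for W, b in zip(W_list, beta):
        d = W.sum(axis=1)
        d[d == 0] = 1.0                      # guard against isolated nodes
        D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
        L += b * (np.eye(n) - D_inv_sqrt @ W @ D_inv_sqrt)
    return L

def propagate(W_list, beta, y, mu=1.0):
    """Minimize f^T L f + mu ||f - y||^2, so f = (L + mu I)^{-1} (mu y)."""
    L = combined_laplacian(W_list, beta)
    n = L.shape[0]
    return np.linalg.solve(L + mu * np.eye(n), mu * y)

# Toy example: node 0 labeled +1, node 2 labeled -1, nodes 1 and 3 unlabeled.
W_A = np.array([[0., 1., 0., 0.],    # graph A links 0-1 and 2-3
                [1., 0., 0., 0.],
                [0., 0., 0., 1.],
                [0., 0., 1., 0.]])
W_B = np.array([[0., 0., 0., 1.],    # graph B links 0-3 and 1-2
                [0., 0., 1., 0.],
                [0., 1., 0., 0.],
                [1., 0., 0., 0.]])
y = np.array([1., 0., -1., 0.])
f = propagate([W_A, W_B], beta=[0.9, 0.1], y=y)
# With graph A dominating, node 1 follows node 0 (+) and node 3 follows node 2 (-).
```

A different weight vector `beta` would let graph B override graph A, which is exactly why optimizing the weights matters when the graphs disagree.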

Many real-world machine learning problems are situated on finite discrete sets,
including dimensionality reduction, clustering, and transductive inference. A variety
of approaches for learning from finite sets have been proposed, from different
motivations and for different problems. In most of these approaches, a finite set
is modeled as a graph in which the edges encode pairwise relationships among the
objects in the set. Consequently, many concepts and methods from graph theory are
adopted. In particular, the graph Laplacian is widely used.
In this chapter we present a systematic framework for learning from a finite set
represented as a graph. We develop discrete analogues of a number of differential
operators, and then construct a discrete analogue of classical regularization theory
based on those discrete differential operators. The graph Laplacian based approaches
are special cases of this general discrete regularization framework. An important
implication of this framework is that we have a wide choice of regularizers on
graphs in addition to the widely used graph Laplacian based one.
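As a concrete instance of Laplacian-based regularization on a graph, the following sketch (my own illustration, with invented data) verifies the discrete smoothness identity for the unnormalized Laplacian L = D - W, the graph analogue of the regularizer ||grad f||^2 from classical regularization theory:

```python
import numpy as np

# Illustration (invented data): the unnormalized graph Laplacian L = D - W
# satisfies the discrete smoothness identity
#   f^T L f = (1/2) * sum_{i,j} w_ij * (f_i - f_j)^2,
# i.e. the quadratic form penalizes functions that vary across heavy edges.

def graph_laplacian(W):
    """Unnormalized Laplacian L = D - W of a symmetric weight matrix W."""
    return np.diag(W.sum(axis=1)) - W

W = np.array([[0., 1., 0.],
              [1., 0., 2.],
              [0., 2., 0.]])
f = np.array([1., 2., 4.])

quad = f @ graph_laplacian(W) @ f
pairwise = 0.5 * sum(W[i, j] * (f[i] - f[j]) ** 2
                     for i in range(3) for j in range(3))
# quad == pairwise == 9.0 for this example.
```

Other discrete differential operators in the framework (gradient, divergence, p-Laplacian) induce different regularizers on the same graph, which is the point of the wider choice mentioned above.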

The annual Neural Information Processing Systems (NIPS) conference is the flagship meeting on neural computation. It draws a diverse group of attendees--physicists, neuroscientists, mathematicians, statisticians, and computer scientists. The presentations are interdisciplinary, with contributions in algorithms, learning theory, cognitive science, neuroscience, brain imaging, vision, speech and signal processing, reinforcement learning and control, emerging technologies, and applications. Only twenty-five percent of the papers submitted are accepted for presentation at NIPS, so the quality is exceptionally high. This volume contains the papers presented at the December 2005 meeting, held in Vancouver.

This book constitutes the thoroughly refereed post-proceedings of the First PASCAL (pattern analysis, statistical modelling and computational learning) Machine Learning Challenges Workshop, MLCW 2005, held in Southampton, UK in April 2005.
The 25 revised full papers presented were carefully selected during two rounds of reviewing and improvement from about 50 submissions. The papers reflect the three challenges addressed in the workshop: the first challenge was to assess the uncertainty of predictions using classical statistics, Bayesian inference, and statistical learning theory; the second was to recognize objects from a number of visual object classes in realistic scenes; the third, recognizing textual entailment, addressed the semantic analysis of language needed to form a generic framework for applied semantic inference in text understanding.

Our goal for the competition (the NIPS 2003 feature selection competition) was to evaluate the usefulness of simple
machine learning techniques. We decided to use a correlation criterion as the feature selection method and Support Vector Machines for the classification part. Here we explain how we chose the regularization parameter C of the SVM, how we determined the kernel parameter, and how we estimated the number of features used for each data set. All analyses were carried out on the
training sets of the competition data. We chose the data set Arcene as an example
to explain the approach step by step.
In our view the point of this competition was the construction of a well-performing
classifier rather than the systematic analysis of a specific approach. This is why our
search for the best classifier was only loosely guided by the described methods and
why we deviated from the road map on several occasions.
All calculations were done with the software Spider [2004].
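The correlation-based ranking step can be sketched as follows. This is a generic Pearson-correlation filter in the spirit of the description, not the authors' Spider code, and the helper name and toy data are invented; training the SVM on the top-ranked features would follow with any standard solver:

```python
import numpy as np

# Generic sketch of correlation-based feature ranking: score each feature by
# the absolute Pearson correlation with the labels, then keep the top ones.

def rank_features(X, y):
    """Return feature indices sorted by |corr(feature, y)|, descending."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    denom[denom == 0] = 1.0                  # avoid division by zero
    corr = (Xc.T @ yc) / denom
    return np.argsort(-np.abs(corr))

# Toy data: feature 0 equals the label, feature 1 is uncorrelated,
# feature 2 is partially informative.
y = np.array([1., 1., -1., -1.])
X = np.array([[ 1.,  1., 0.5],
              [ 1., -1., 0.5],
              [-1.,  1., -0.5],
              [-1., -1., 0.0]])
ranking = rank_features(X, y)
# ranking[0] == 0: the perfectly correlated feature is ranked first.
```

The number of features to keep is then a hyperparameter, which the authors estimated per data set on the training data.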

Embedded methods are a relatively new approach to feature selection. Unlike filter methods, which do not incorporate learning, and wrapper approaches, which can be used with arbitrary classifiers, in embedded methods the feature selection part cannot be separated from the learning part.
Existing embedded methods are reviewed based on a unifying mathematical framework.
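One classical example of an embedded method, given here as an illustrative sketch rather than an excerpt from the review, is an L1-regularized linear model fitted with ISTA (proximal gradient): the soft-thresholding step inside the optimization drives the weights of irrelevant features exactly to zero, so selection happens within learning itself.

```python
import numpy as np

# Illustrative embedded method (my own sketch): L1-regularized linear
# least squares trained by ISTA.  Soft-thresholding during optimization
# sets irrelevant weights exactly to zero.

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize (1/2n)||Xw - y||^2 + lam * ||w||_1 by proximal gradient."""
    n, d = X.shape
    step = n / np.linalg.norm(X, 2) ** 2     # 1 / Lipschitz constant
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = soft_threshold(w - step * grad, step * lam)
    return w

# Toy data: feature 0 generates y exactly, feature 1 is irrelevant.
X = np.array([[1.,  1.],
              [2., -1.],
              [3.,  1.],
              [4., -1.]])
y = np.array([1., 2., 3., 4.])
w = lasso_ista(X, y, lam=0.1)
# w[0] is close to 1 (slightly shrunk by the penalty); w[1] is exactly zero.
```

Here no separate selection pass is run over the features: the sparsity pattern of `w` falls out of the fit, which is the defining property of the embedded family.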

Our goal is to understand the principles of Perception, Action and Learning in autonomous systems that successfully interact with complex environments, and to use this understanding to design future systems.