Nanorobots are untethered structures of sub-micron size that can be controlled in a non-trivial way. Such nanoscale robotic agents are envisioned to revolutionize medicine by enabling minimally invasive diagnostic and therapeutic procedures. To be useful, nanorobots must be operated in complex biological fluids and tissues, which are often difficult to penetrate. In this chapter, we first discuss potential medical applications of motile nanorobots. We briefly present the challenges related to swimming at such small scales and we survey the rheological properties of some biological fluids and tissues. We then review recent experimental results in the development of nanorobots and in particular their design, fabrication, actuation, and propulsion in complex biological fluids and tissues. Recent work shows that their nanoscale dimension is a clear asset for operation in biological tissues, since many biological tissues consist of networks of macromolecules that prevent the passage of larger micron-scale structures, but contain dynamic pores through which nanorobots can move.

Haptics is an interdisciplinary field that seeks to both understand and engineer touch-based interaction. Although a wide range of systems and applications are being investigated, haptics researchers often concentrate on perception and manipulation through the human hand.
A haptic interface is a mechatronic system that modulates the physical interaction between a human and his or her tangible surroundings. Haptic interfaces typically involve mechanical, electrical, and computational layers that work together to sense user motions or forces, quickly process these inputs with other information, and physically respond by actuating elements of the user’s surroundings, thereby enabling him or her to act on and feel a remote and/or virtual environment.

Recent approaches to independent component analysis (ICA) have used kernel
independence measures to obtain very good performance, particularly
in areas where classical methods experience difficulty (for instance,
sources with near-zero kurtosis). In this chapter, we compare two efficient
extensions of these methods for large-scale problems: random subsampling
of entries in the Gram matrices used in defining the independence
measures, and incomplete Cholesky decomposition of these matrices.
We derive closed-form, efficiently computable approximations for the
gradients of these measures, and compare their performance on ICA using
both artificial and music data. We show that kernel ICA can scale up to much larger
problems than previously attempted, and that incomplete Cholesky decomposition
performs better than random sampling.
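To make the second approach concrete, here is a minimal sketch of pivoted incomplete Cholesky decomposition of a Gram matrix (the function name and stopping rule are ours, not the chapter's):

```python
import numpy as np

def incomplete_cholesky(K, tol=1e-6, max_rank=None):
    """Pivoted incomplete Cholesky of a PSD Gram matrix K.

    Returns G (n x m) with K ~= G @ G.T, stopping once the largest
    residual diagonal entry drops below tol."""
    n = K.shape[0]
    if max_rank is None:
        max_rank = n
    d = np.diag(K).astype(float).copy()   # residual diagonal
    G = np.zeros((n, max_rank))
    for m in range(max_rank):
        i = int(np.argmax(d))             # greedy pivot choice
        if d[i] < tol:
            return G[:, :m]
        G[:, m] = (K[:, i] - G[:, :m] @ G[i, :m]) / np.sqrt(d[i])
        d -= G[:, m] ** 2
        d[i] = 0.0                        # guard against round-off
    return G
```

Because the rank m is typically much smaller than n, downstream computations on the Gram matrix drop from O(n^3) to O(nm^2).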

Most literature on Support Vector Machines (SVMs) concentrates on
the dual optimization problem. In this paper, we would like to point out
that the primal problem can also be solved efficiently, both for linear
and non-linear SVMs, and that there is no reason to ignore this possibility.
On the contrary, from the primal point of view new families of algorithms for
large scale SVM training can be investigated.
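As a toy illustration of solving the primal directly (a plain subgradient sketch, not the paper's specific algorithm), one can minimize the regularized hinge loss in the original weight space:

```python
import numpy as np

def primal_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Linear SVM trained in the primal: subgradient descent on
    lam/2 * ||w||^2 + mean(max(0, 1 - y * (X @ w + b)))."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for t in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                 # margin-violating points
        gw = lam * w - (y[active, None] * X[active]).sum(axis=0) / n
        gb = -y[active].sum() / n
        step = lr / (1 + t * lr * lam)       # decaying step size
        w -= step * gw
        b -= step * gb
    return w, b
```

No dual variables or kernel matrix appear: for linear SVMs the cost per pass is O(nd), which is what makes primal formulations attractive at large scale.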

A wealth of computationally efficient approximation methods for Gaussian process regression have been recently proposed. We give a unifying overview of sparse approximations, following Quiñonero-Candela and Rasmussen (2005), and a brief review of approximate matrix-vector multiplication methods.
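One member of the sparse family surveyed in that framework is the Subset of Regressors (SoR) approximation. The sketch below assumes a squared-exponential covariance and a given set of inducing inputs (function and variable names are ours):

```python
import numpy as np

def rbf(A, B, ell=1.0):
    """Squared-exponential covariance between row-wise input sets."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def sor_predict(X, y, Xu, Xs, noise=0.1, ell=1.0):
    """Subset-of-Regressors sparse GP mean at test inputs Xs,
    using m inducing inputs Xu: cost O(n m^2) instead of O(n^3)."""
    Kuf = rbf(Xu, X, ell)                  # m x n
    Kuu = rbf(Xu, Xu, ell)                 # m x m
    Ksu = rbf(Xs, Xu, ell)                 # s x m
    A = noise**2 * Kuu + Kuf @ Kuf.T       # m x m system to solve
    return Ksu @ np.linalg.solve(A, Kuf @ y)
```

The different sparse approximations in the unifying view differ mainly in which covariance blocks they replace with such low-rank inducing-point terms.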

Abstract. This paper considers kernels invariant to translation, rotation and dilation. We show that no non-trivial
positive definite (p.d.) kernels exist which are radial and dilation invariant, only conditionally positive definite
(c.p.d.) ones. Accordingly, we discuss the c.p.d. case and provide some novel analysis, including an elementary
derivation of a c.p.d. representer theorem. On the practical side, we give a support vector machine (s.v.m.) algorithm
for arbitrary c.p.d. kernels. For the thin-plate kernel this leads to a classifier with only one parameter (the
amount of regularisation), which we demonstrate to be as effective as an s.v.m. with the Gaussian kernel, even
though the Gaussian involves a second parameter (the length scale).

Convex learning algorithms, such as Support Vector Machines (SVMs), are often
seen as highly desirable because they offer strong practical properties and are
amenable to theoretical analysis. However, in this work we show how nonconvexity
can provide scalability advantages over convexity. We show how concave-convex
programming can be applied to produce (i) faster SVMs where training errors are
no longer support vectors, and (ii) much faster Transductive SVMs.

(TR-07-47), University of Texas, Austin, TX, USA, September 2007 (techreport)

Abstract

Several important machine learning problems can be modeled and solved via semidefinite programs. Often, researchers invoke off-the-shelf software for the associated optimization, which can be inappropriate for many applications due to computational and storage requirements. In this paper, we introduce the use of convex perturbations for semidefinite programs (SDPs). Using a particular perturbation function, we arrive
at an algorithm for SDPs that has several advantages over existing techniques: a) it is simple, requiring only a few lines of MATLAB, b) it is a first-order method which makes it scalable, c) it can easily exploit the structure of a particular SDP to gain efficiency (e.g., when the constraint matrices are low-rank). We demonstrate on several machine learning applications that the proposed algorithm is effective in finding fast approximations to large-scale SDPs.

Most existing sparse Gaussian process (g.p.) models seek computational advantages by basing their
computations on a set of m basis functions that are the covariance function of the g.p. with one of its two inputs
fixed. We generalise this for the case of Gaussian covariance function, by basing our computations on m Gaussian
basis functions with arbitrary diagonal covariance matrices (or length scales). For a fixed number of basis
functions and any given criterion, this additional flexibility permits approximations no worse and typically better
than was previously possible. Although we focus on g.p. regression, the central idea is applicable to all kernel
based algorithms, such as the support vector machine. We perform gradient based optimisation of the marginal
likelihood, which costs O(m2n) time where n is the number of data points, and compare the method to various
other sparse g.p. methods. Our approach outperforms the other methods, particularly for the case of very few basis
functions, i.e. a very high sparsity ratio.

Recent years have seen huge advances in object recognition from images. Recognition rates beyond 95% are the rule rather than the exception on many datasets. However, most state-of-the-art methods can only decide if an object is present or not. They are not able to provide information on the object's location or extent within the image.
We report on a simple yet powerful scheme that extends many existing recognition methods to also perform localization of object bounding boxes. This is achieved by maximizing the classification score over all possible subrectangles in the image. Despite the impression that this would be computationally intractable, we show that in many situations efficient algorithms exist which solve a generalized maximum subrectangle problem.
We show how our method is applicable to a variety of object detection frameworks and demonstrate its performance by applying it to the popular bag-of-visual-words model, achieving competitive results on the PASCAL VOC 2006 dataset.
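In the special case where the classification score decomposes additively over pixels, the maximum-subrectangle problem is solved exactly by a 2-D extension of Kadane's algorithm (this is only an illustrative special case; the paper handles far more general quality functions):

```python
def max_subrectangle(score):
    """Best-scoring axis-aligned rectangle for a per-pixel score grid.
    O(rows^2 * cols): fix a row band, then run 1-D Kadane on its
    column sums. Returns (best score, (top, left, bottom, right))."""
    rows, cols = len(score), len(score[0])
    best, best_box = float('-inf'), None
    for top in range(rows):
        col_sum = [0.0] * cols
        for bottom in range(top, rows):
            for c in range(cols):
                col_sum[c] += score[bottom][c]
            cur, left = 0.0, 0                  # 1-D Kadane over col_sum
            for right in range(cols):
                if cur <= 0:
                    cur, left = 0.0, right      # restart the window
                cur += col_sum[right]
                if cur > best:
                    best, best_box = cur, (top, left, bottom, right)
    return best, best_box
```

Per-pixel scores arise naturally from linear classifiers over local features, where each feature contributes its weight to the pixel it falls on.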

Assume we are given a sample of points from some underlying
distribution which contains several distinct clusters. Our goal is
to construct a neighborhood graph on the sample points such that
clusters are "identified": that is, the subgraph induced by points
from the same cluster is connected, while subgraphs corresponding to
different clusters are not connected to each other. We derive bounds
on the probability that cluster identification is successful, and
use them to predict "optimal" values of k for the mutual and
symmetric k-nearest-neighbor graphs. We point out different
properties of the mutual and symmetric nearest-neighbor graphs
related to the cluster identification problem.
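A minimal sketch of the mutual k-NN construction and the connectivity check (brute-force distances; function name ours):

```python
import numpy as np

def mutual_knn_components(X, k):
    """Build the mutual k-NN graph (edge i-j iff each is among the
    other's k nearest neighbours) and return its connected components."""
    n = len(X)
    D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(D, np.inf)                       # exclude self
    nbrs = [set(np.argsort(D[i])[:k]) for i in range(n)]
    adj = [[j for j in nbrs[i] if i in nbrs[j]] for i in range(n)]
    seen, comps = set(), []
    for s in range(n):                                # DFS over components
        if s in seen:
            continue
        stack, comp = [s], set()
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            seen.add(v)
            stack.extend(adj[v])
        comps.append(sorted(comp))
    return comps
```

Cluster identification succeeds exactly when the returned components coincide with the true clusters, which is the event whose probability the bounds control.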

We describe two related models to cluster multidimensional time-series under the assumption of an underlying linear Gaussian dynamical process. In the first model, time-series are assigned to the same cluster when they show global similarity in their dynamics, while in the second model time-series are assigned to the same cluster when they show simultaneous similarity. Both models are based on Dirichlet Mixtures of Bayesian Linear Gaussian State-Space Models in order to (semi-)automatically determine an appropriate number of components in the mixture, and to additionally bias the components towards a parsimonious parameterization. The resulting models are formally intractable; to deal with this, we describe a deterministic approximation based on a novel implementation of Variational Bayes.

This paper presents a fully automated algorithm for reconstructing a textured 3D model of a face from a single photograph or a raw video stream. The algorithm is based on a combination of Support Vector Machines (SVMs) and a Morphable Model of 3D faces. After SVM face detection, individual facial features are detected using a novel regression- and classification-based approach, and probabilistically plausible configurations of features are selected to produce a list of candidates for several facial feature positions. In the next step, the configurations of feature points are evaluated using a novel criterion that is based on a Morphable Model and a
combination of linear projections. Finally, the feature points initialize a model-fitting procedure of the Morphable Model. The result is a high-resolution 3D surface model.

In this chapter we are concerned with the problem of reconstructing patterns from their representation in feature space, known as the pre-image problem. We review existing algorithms and propose a learning based approach. All algorithms are discussed regarding their usability and complexity and evaluated on an image denoising application.

This technical report describes a cute idea of how to create new policy search approaches. It directly relates to the Natural Actor-Critic methods but allows the derivation of one-shot solutions. Future work may include the application to interesting problems.

In the past, computational motor control has been approached from at least two major frameworks: the dynamic systems approach and the viewpoint of optimal control. The dynamic systems approach emphasizes motor control as a process of self-organization between an animal and its environment. Nonlinear differential equations that can model entrainment and synchronization behavior are among the favored tools of dynamic systems modelers. In contrast, optimal control approaches view motor control as the evolutionary or developmental result of a nervous system that tries to optimize rather general organizational principles, e.g., energy consumption or accurate task achievement. Optimal control theory is usually employed to develop appropriate theories. Interestingly, there is rather little interaction between dynamic systems and optimal control modelers, as the two approaches follow rather different philosophies and are often viewed as diametrically opposed. In this paper, we develop a computational approach to motor control that offers a unifying modeling framework for both dynamic systems and optimal control approaches. In discussions of several behavioral experiments and some theoretical and robotics studies, we demonstrate how our computational ideas allow both the representation of self-organizing processes and the optimization of movement based on reward criteria. Our modeling framework is rather simple and general, and opens opportunities to revisit many previous modeling results from this novel unifying view.

We introduce a modified Kalman filter that performs robust, real-time outlier detection, without the need for manual parameter tuning by the user. Systems that rely on high-quality sensory data (for instance, robotic systems) can be sensitive to data containing outliers. The standard Kalman filter is not robust to outliers, and other variations of the Kalman filter have been proposed to overcome this issue. However, these methods may require manual parameter tuning, the use of heuristics, or complicated parameter estimation procedures. Our Kalman filter uses a weighted least squares-like approach by introducing weights for each data sample. A data sample with a smaller weight has a weaker contribution when estimating the current time step's state. Using an incremental variational Expectation-Maximization framework, we learn the weights and system dynamics. We evaluate our Kalman filter algorithm on data from a robotic dog.
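The paper learns per-sample weights by variational EM; the sketch below instead uses a crude fixed thresholding rule, only to show where such weights enter the Kalman gain (1-D random-walk model; all names and the threshold rule are our own):

```python
import numpy as np

def robust_kalman_1d(zs, q=0.01, r=1.0, thresh=3.0):
    """1-D Kalman filter that down-weights improbable measurements.
    A sample whose innovation exceeds `thresh` standard deviations
    gets a tiny weight, i.e. inflated measurement noise."""
    x, p = zs[0], 1.0
    out = [x]
    for z in zs[1:]:
        p = p + q                        # predict (random-walk dynamics)
        s = p + r                        # innovation variance
        innov = z - x
        w = 1.0 if innov**2 <= thresh**2 * s else 1e-3   # sample weight
        k = p * w / (p * w + r)          # weighted Kalman gain
        x = x + k * innov
        p = (1 - k) * p
        out.append(x)
    return np.array(out)
```

With weight w, the effective measurement noise becomes r / w, so outliers barely move the state estimate while inliers are processed as usual.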

When designing a Brain-Computer Interface (BCI) system, one can choose from a variety of features that
may be useful for classifying brain activity during a mental task. For the special case of classifying EEG signals, we propose the use of the state-of-the-art feature selection algorithms Recursive Feature Elimination [3] and Zero-Norm Optimization [13], which are based on the training of Support Vector Machines (SVM) [11]. These algorithms can provide more accurate solutions than standard filter methods for feature selection [14].
We adapt the methods for the purpose of selecting EEG channels. For a motor imagery paradigm we
show that the number of used channels can be reduced significantly without increasing the classification error. The resulting best channels agree well with the expected underlying cortical activity patterns during the mental tasks.
Furthermore, we show how time-dependent, task-specific information can be visualized.
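A toy sketch of the Recursive Feature Elimination loop, with a ridge fit standing in for the SVM weight vector used in the paper (function name and the ridge surrogate are ours):

```python
import numpy as np

def rfe_ridge(X, y, n_keep, lam=1.0):
    """Recursive Feature Elimination sketch: repeatedly fit a linear
    model and drop the feature with the smallest absolute weight,
    until only n_keep features (e.g. EEG channels) remain."""
    feats = list(range(X.shape[1]))
    while len(feats) > n_keep:
        Xs = X[:, feats]
        w = np.linalg.solve(Xs.T @ Xs + lam * np.eye(len(feats)), Xs.T @ y)
        feats.pop(int(np.argmin(np.abs(w))))      # eliminate weakest feature
    return feats
```

For channel selection, each "feature" would group all per-channel features, and the elimination score aggregates the corresponding block of weights.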

The Google search engine has had a huge success with its PageRank
web page ranking algorithm, which exploits global, rather than
local, hyperlink structure of the World Wide Web using random
walks. This algorithm can only be used for graph data, however.
Here we propose a simple universal ranking algorithm for vectorial
data, based on the exploration of the intrinsic global geometric
structure revealed by a huge amount of data. Experimental results
on data ranging from images and text to bioinformatics illustrate the validity of
our algorithm.
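The closed form commonly used for this style of ranking can be sketched as follows (assuming a Gaussian affinity and a symmetrically normalized graph; the function name is ours):

```python
import numpy as np

def manifold_rank(X, query_idx, alpha=0.9, sigma=1.0):
    """Rank all points w.r.t. the query via the closed form
    f = (I - alpha * S)^{-1} y, with S the symmetrically
    normalized Gaussian affinity matrix."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma**2))
    np.fill_diagonal(W, 0.0)
    Dinv = 1.0 / np.sqrt(W.sum(1))
    S = Dinv[:, None] * W * Dinv[None, :]
    y = np.zeros(len(X))
    y[query_idx] = 1.0                       # indicator of the query
    f = np.linalg.solve(np.eye(len(X)) - alpha * S, y)
    return np.argsort(-f)                    # most to least relevant
```

Since (I - alpha*S)^{-1} = sum_t (alpha*S)^t, the ranking score spreads from the query along the manifold of the data, much as PageRank spreads along hyperlinks.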

A new method for performing a kernel principal component analysis is
proposed. By kernelizing the generalized Hebbian algorithm, one can
iteratively estimate the principal components in a reproducing
kernel Hilbert space with only linear order memory complexity. The
derivation of the method, a convergence proof, and preliminary
applications in image hyperresolution are presented. In addition,
we discuss the extension of the method to the online learning of
kernel principal components.
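The linear core of the generalized Hebbian algorithm can be sketched with Oja's rule for the first component only (the kernel version applies the same update to expansion coefficients of training points in the RKHS, which we omit here):

```python
import numpy as np

def oja_first_pc(X, lr=0.01, epochs=50, seed=0):
    """Oja/GHA-style iterative estimate of the first principal
    component direction, using O(d) memory instead of the O(d^2)
    needed to form the covariance matrix."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for x in X:
            y = w @ x
            w += lr * y * (x - y * w)    # Oja's rule: Hebbian + decay
    return w / np.linalg.norm(w)
```

The memory advantage quoted in the abstract comes from exactly this structure: the kernelized update stores only the coefficient vectors, never the full Gram matrix.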

We consider the learning problem in the transductive setting. Given
a set of points of which only some are labeled, the goal is to
predict the labels of the unlabeled points. A principled way to
solve such a learning problem is the consistency assumption that a
classifying function should be sufficiently smooth with respect to
the structure revealed by these known labeled and unlabeled points. We present a simple
algorithm to obtain such a smooth solution. Our method yields encouraging experimental results on a
number of classification problems and demonstrates effective use of
unlabeled data.
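The smooth solution referred to above has a simple closed form in the local and global consistency framework of Zhou et al.; writing $W$ for the affinity matrix, $D$ for its degree matrix, $Y$ for the initial label matrix, and $S = D^{-1/2} W D^{-1/2}$, the iteration

\[
F(t+1) = \alpha S F(t) + (1-\alpha) Y
\]

converges to

\[
F^{*} = (1-\alpha)\,(I - \alpha S)^{-1} Y ,
\]

and each point is assigned the label with the largest entry in its row of $F^{*}$.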

The Wiener series is one of the standard methods to systematically
characterize the nonlinearity of a neural system. The classical
estimation method of the expansion coefficients via cross-correlation
suffers from severe problems that prevent its application to
high-dimensional and strongly nonlinear systems. We propose a new
estimation method based on regression in a reproducing kernel Hilbert
space that overcomes these problems. Numerical experiments show
performance advantages in terms of convergence, interpretability and
system size that can be handled.
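The regression view can be sketched with kernel ridge regression and an inhomogeneous polynomial kernel, whose implicit feature space contains all input monomials up to the chosen degree, i.e. a truncated Volterra/Wiener-style expansion (a sketch under our own simplifications, not the chapter's estimator):

```python
import numpy as np

def poly_kernel_regression(X, y, degree=3, lam=1e-3):
    """Kernel ridge regression with the kernel (1 + x.x')^p, which
    implicitly spans all monomials up to degree p without ever
    expanding them; returns a prediction function."""
    K = (1.0 + X @ X.T) ** degree
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    def predict(Xs):
        return ((1.0 + Xs @ X.T) ** degree) @ alpha
    return predict
```

The cost depends on the number of samples, not on the (combinatorial) number of expansion coefficients, which is what makes high input dimensions and strong nonlinearities tractable.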

A key tool in protein function discovery is the ability to rank databases of proteins given a query amino acid sequence. The most successful method so far is a web-based tool called PSI-BLAST, which uses heuristic alignment of a profile built using the large unlabeled database. It has been shown that such use of global information via unlabeled data improves over a local measure derived from a basic pairwise alignment such as that performed by PSI-BLAST's predecessor, BLAST. In this article we
look at ways of leveraging techniques from the field of machine learning for the problem of ranking. We show how clustering and semi-supervised learning techniques, which aim to capture global structure in data, can significantly improve over PSI-BLAST.

Canonical correlation analysis (CCA) is a classical multivariate method concerned with describing linear dependencies between sets of variables. After a short exposition of the linear sample CCA problem and its analytical solution, the article proceeds with a detailed characterization of its geometry. Projection operators are used to illustrate the relations between canonical vectors and variates. The article then addresses the problem of CCA between spaces spanned by objects mapped into kernel feature spaces. An exact solution for this kernel canonical correlation (KCCA) problem is derived from a geometric point of view. It shows that the expansion coefficients of the canonical vectors in their respective feature space can be found by linear CCA in the basis induced by kernel principal component analysis. The effect of mappings into higher dimensional feature spaces is considered critically since it simplifies the CCA problem in general. Then two regularized variants of KCCA are discussed. Relations to other methods are illustrated, e.g., multicategory kernel Fisher discriminant analysis, kernel principal component regression and possible applications thereof in blind source separation.

We introduce two new functions, the kernel covariance (KC) and the kernel
mutual information (KMI), to measure the degree of independence of several
continuous random variables.
The former is guaranteed to be zero if and only if the random variables
are pairwise independent; the latter shares this property, and is in addition
an approximate upper bound on the mutual information, as measured near
independence, and is based on a kernel density estimate.
We show that Bach and Jordan's kernel generalised variance (KGV) is also
an upper bound on the same kernel density estimate, but is looser.
Finally, we suggest that the addition of a regularising term in the KGV
causes it to approach the KMI, which motivates the introduction of this
regularisation.
The performance of the KC and KMI is verified in the context of instantaneous
independent component analysis (ICA), by recovering both artificial and
real (musical) signals following linear mixing.
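As a hedged stand-in for the KC and KMI themselves, here is a closely related and easily computed kernel independence statistic of the same family (the HSIC trace form; it is likewise near zero for independent variables):

```python
import numpy as np

def rbf_gram(x, sigma=1.0):
    """Gaussian Gram matrix of a 1-D sample."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma**2))

def hsic(x, y, sigma=1.0):
    """Empirical kernel independence statistic (1/n^2) tr(HKHL),
    where H centres the Gram matrices K and L in feature space."""
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n
    K, L = rbf_gram(x, sigma), rbf_gram(y, sigma)
    return np.trace(H @ K @ H @ L) / n**2
```

In ICA such a statistic is evaluated on pairs of estimated sources, and the demixing matrix is chosen to drive it towards zero.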

In this short note, building on ideas of M. Herbster [2], we propose a method for automatically tuning the
parameter of the FIXED-SHARE algorithm proposed by Herbster and
Warmuth [3] in the context of on-line learning with
shifting experts. We show that this can be done with a memory
requirement of $O(nT)$ and that the additional loss incurred by
the tuning is the same as the loss incurred for estimating the
parameter of a Bernoulli random variable.
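The FIXED-SHARE update itself (before any tuning of the share parameter) can be sketched as the standard Herbster-Warmuth form:

```python
import numpy as np

def fixed_share(losses, eta=2.0, alpha=0.05):
    """Fixed-Share forecaster: exponential-weights loss update, then
    a fraction alpha of the weight is shared equally across experts,
    letting the forecaster track the best shifting expert.
    losses has shape (T, N); returns the weights used at each round."""
    T, N = losses.shape
    w = np.full(N, 1.0 / N)
    preds = []
    for t in range(T):
        preds.append(w.copy())                  # weights used at time t
        v = w * np.exp(-eta * losses[t])        # loss (exponential) update
        v /= v.sum()
        w = (1 - alpha) * v + alpha / N         # share update
    return np.array(preds)
```

The note's contribution concerns choosing alpha online rather than fixing it in advance; the update above is what that tuning wraps.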

Interactive Images are a natural extension of three recent developments: digital photography, interactive web pages, and browsable video. An interactive image is a multi-dimensional image, displayed two dimensions at a time (like a standard digital image), but with which a user can interact to browse through the other dimensions. One might consider a standard video sequence viewed with a video player as a simple interactive image with time as the third dimension. Interactive images are a generalization of this idea, in which the third (and greater) dimensions may be focus, exposure, white balance, saturation, and other parameters. Interaction is handled via a variety of modes including those we call ordinal, pixel-indexed, cumulative, and comprehensive. Through exploration of three novel forms of interactive images based on color, exposure, and focus, we will demonstrate the compelling nature of interactive images.

Our goal is to understand the principles of Perception, Action and Learning in autonomous systems that successfully interact with complex environments, and to use this understanding to design future systems.