Resting state activity is brain activation that arises in the absence of any task, and is usually measured
in awake subjects during prolonged fMRI scanning sessions where the only instruction given is to
close the eyes and do nothing. It has been recognized in recent years that resting state activity is
implicated in a wide variety of brain functions. While certain networks of brain areas have different
levels of activation at rest and during a task, there is nevertheless significant similarity between
activations in the two cases. This suggests that recordings of resting state activity can be used as
a source of unlabeled data to augment kernel canonical correlation analysis (KCCA) in a semi-supervised
setting. We evaluate this setting empirically, yielding three main results: (i) KCCA tends
to be improved by the use of Laplacian regularization even when no additional unlabeled data are
available, (ii) resting state data seem to have a similar marginal distribution to that recorded during
the execution of a visual processing task implying largely similar types of activation, and (iii) this
source of information can be broadly exploited to improve the robustness of empirical inference in
fMRI studies, an inherently data poor domain.
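
As a concrete illustration of result (i), the sketch below sets up Laplacian-regularized KCCA on labeled data alone: a graph Laplacian built over the fMRI view is added to the X-side constraint of the usual regularized KCCA generalized eigenproblem. The helper names, the k-nearest-neighbour graph construction, and all parameter values are assumptions made for this sketch, not the implementation evaluated in the abstract.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import pdist, squareform

def rbf_kernel(X, gamma=1.0):
    """RBF kernel matrix over the rows of X."""
    return np.exp(-gamma * squareform(pdist(X, 'sqeuclidean')))

def knn_laplacian(K, k=10):
    """Unnormalized graph Laplacian of a k-nearest-neighbour graph built from kernel similarities."""
    n = K.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(-K[i])[1:k + 1]   # skip the point itself
        W[i, nbrs] = K[i, nbrs]
    W = np.maximum(W, W.T)                   # symmetrize
    return np.diag(W.sum(axis=1)) - W

def laplacian_kcca(Kx, Ky, eps=1e-3, kappa=1e-2, k=10):
    """First pair of KCCA dual directions with a graph-Laplacian penalty on the X view."""
    n = Kx.shape[0]
    L = knn_laplacian(Kx, k)
    Rx = Kx @ Kx + eps * np.eye(n) + kappa * (Kx @ L @ Kx)   # regularized X-view constraint
    Ry = Ky @ Ky + eps * np.eye(n)                            # regularized Y-view constraint
    A = np.block([[np.zeros((n, n)), Kx @ Ky],
                  [Ky @ Kx,          np.zeros((n, n))]])
    B = np.block([[Rx, np.zeros((n, n))],
                  [np.zeros((n, n)), Ry]])
    vals, vecs = eigh(A, B)                  # generalized symmetric eigenproblem
    alpha, beta = vecs[:n, -1], vecs[n:, -1]
    return alpha, beta, vals[-1]             # dual coefficients and top canonical correlation
```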

40(616.2), 40th Annual Meeting of the Society for Neuroscience (Neuroscience), November 2010 (poster)

Abstract

Rhythms in the gamma band (30-100 Hz) are observed in the mammalian brain with a large variety of functional correlates. Nevertheless, their functional role is still debated. One way to disentangle this issue is to go beyond the usual correlation analysis and apply causality measures that quantify the directed interactions between the gamma rhythms and other aspects of neural activity. These measures can be further compared with other aspects of neurophysiological signals to find markers of neural interactions.
In a recent study, we analyzed extracellular recordings in the primary visual cortex of 4 anesthetized macaques during the presentation of movie stimuli using a causality measure named Transfer Entropy. We found causal interactions between high frequency gamma rhythms (60-100Hz) recorded in different electrodes, involving in particular their phase, and between the gamma phase and spiking activity quantified by the instantaneous envelope of the MUA band (1-3kHz).
Here, we further investigate in the same dataset the meaning of these phase-MUA and phase-phase causal interactions by studying the distribution of phases at multiple recording sites at lags around the occurrence of spiking events.
First, we found a sharpening of the gamma phase distribution in one electrode when spikes occur at another recording site. This phenomenon appeared as a form of phase-spike synchronization and was quantified by an information-theoretic measure. We found that this measure correlates significantly with phase-MUA causal interactions. Additionally, we quantified in a similar way the interplay between spiking and the phase difference between two recording sites (reflecting the well-known concept of phase synchronization). We found that, depending on the pair of recording sites, spiking can correlate either with phase synchronization or with desynchronization with respect to the baseline. This effect correlates very well with the phase-phase causality measure.
These results provide evidence that high-frequency phase-spike synchronization reflects communication between distant neural populations in V1. Conversely, either phase synchronization or desynchronization may favor neural communication between recording sites. This new result, which contrasts with current hypotheses on the role of phase synchronization, could be interpreted as the presence of inhibitory interactions that are suppressed by desynchronization. Finally, our findings give new insights into the role of gamma rhythms in regulating local computation in the visual cortex.
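
The directed-interaction measure referred to above, Transfer Entropy, can be illustrated with a generic binned (plug-in) estimator. The function below is a minimal sketch under that assumption; the argument names are placeholders (x and y could stand, e.g., for a gamma-phase signal and the MUA envelope), and this is not the estimator used in the study.

```python
import numpy as np

def transfer_entropy(x, y, n_bins=8, lag=1):
    """Plug-in estimate (in bits) of the transfer entropy TE(x -> y) from two discretized signals."""
    xs = np.digitize(x, np.histogram_bin_edges(x, bins=n_bins)[1:-1])
    ys = np.digitize(y, np.histogram_bin_edges(y, bins=n_bins)[1:-1])
    y_next, y_past, x_past = ys[lag:], ys[:-lag], xs[:-lag]

    counts = np.zeros((n_bins, n_bins, n_bins))
    np.add.at(counts, (y_next, y_past, x_past), 1)   # joint histogram over (y_t+lag, y_t, x_t)
    p = counts / counts.sum()

    p_yx = p.sum(axis=0)       # p(y_past, x_past)
    p_yy = p.sum(axis=2)       # p(y_next, y_past)
    p_y = p.sum(axis=(0, 2))   # p(y_past)

    te = 0.0
    for i, j, k in zip(*np.nonzero(p)):
        # p(y_next | y_past, x_past) / p(y_next | y_past), weighted by the joint probability
        te += p[i, j, k] * np.log2(p[i, j, k] * p_y[j] / (p_yx[j, k] * p_yy[i, j]))
    return te
```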

manner. First, the backbone resonances are assigned. This is usually achieved from sequential information provided by three chemical shifts: CA, CB and C’. Once the sequence is solved, the second assignment step takes place. For this purpose, the CA-CB and HA chemical shifts are used as a starting point for the assignment of the side chain resonances, thus connecting the backbone resonances to their respective side chains. This strategy is unfortunately limited by
the size of the protein due to increasing signal overlap and missing signals. Therefore, amino acid recognition is in many cases not possible as the CA-CB chemical shift pattern is not sufficient to discriminate between the 20 amino acids. As a result, the first step of the strategy
described above remains tedious and time consuming. The combination of modern NMR techniques with new spectrometers now provides information that was not always accessible
in the past due to sensitivity problems. These experiments can be applied efficiently to proteins of up to 45 kDa and furthermore provide a unique combination of
sequential carbon spin system information. The assignment process can thus benefit from a maximum knowledge input, containing "all" backbone and side chain chemical shifts as
well as an immediate amino acid recognition from the side chain spin system. We propose to extend the software PASTA (Protein ASsignment by Threshold Accepting) to achieve
a general sequential assignment of backbone and side-chain resonances in a semi- to fully automatic per-residue approach. PASTA will offer the possibility to achieve the sequential assignment using any kind of chemical shifts (carbons and/or protons) that can provide sequential information, combined with an amino acid recognition feature based on carbon spin system analysis.

Programming-by-demonstration promises to significantly reduce the burden of coding robots to perform new tasks. However, service robots will be presented with a variety of different situations that were not specifically
demonstrated to them. In such cases, the robot must autonomously generalize its learned motions to these new situations. We propose a system that can generalize movements to new target locations and even to new objects. The former is achieved by using a task-specific coordinate system together with dynamical systems motor primitives. Generalizing actions to new
objects is a more complex problem, which we solve by treating it as a
continuum-armed bandits problem. Using the bandits framework, we can
efficiently optimize the learned action for a specific object. The proposed method was implemented on a real robot and successfully adapted the grasping action to three different objects. Although we focus on grasping as an example task, the proposed methods are much more widely applicable to robot manipulation tasks.
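
For readers unfamiliar with dynamical systems motor primitives (DMPs), the sketch below integrates a standard one-dimensional DMP; generalizing to a new target amounts to changing the goal argument, which in the paper is additionally expressed in a task-specific coordinate frame. Parameter values, basis-function placement, and the function name are common defaults assumed here, not the system from the abstract, and the continuum-armed bandit optimization over objects is not shown.

```python
import numpy as np

def dmp_rollout(y0, goal, weights, duration=1.0, dt=0.005,
                alpha=25.0, beta=6.25, alpha_x=8.0):
    """Integrate a one-dimensional dynamical-systems motor primitive towards a (possibly new) goal."""
    n_basis = len(weights)
    centers = np.exp(-alpha_x * np.linspace(0, 1, n_basis))   # basis centres in phase space
    widths = n_basis ** 1.5 / centers                          # heuristic basis widths
    y, v, x = float(y0), 0.0, 1.0                              # position, velocity, phase
    trajectory = []
    for _ in range(int(duration / dt)):
        psi = np.exp(-widths * (x - centers) ** 2)
        forcing = (psi @ weights) / (psi.sum() + 1e-10) * x * (goal - y0)
        v += (alpha * (beta * (goal - y) - v) + forcing) * dt / duration
        y += v * dt / duration
        x += -alpha_x * x * dt / duration                      # canonical system
        trajectory.append(y)
    return np.array(trajectory)

# Generalizing to a new target: keep the learned weights and simply change `goal`.
```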

The combination of PET and MRI is an emerging field of current research. It is known that the positron range is shortened in high magnetic fields (MF), leading to an improved resolution in PET images. Interestingly, only the fraction of the positron range (PR) orthogonal to the MF is reduced; the fraction along the MF is not affected, which yields a non-isotropic count distribution. We measured the PR effect with PET isotopes such as F-18, Cu-64, C-11, N-13 and Ga-68. A piece of paper (1 cm2) was soaked with each isotope and placed in the cFOV of a clinical 3T BrainPET/MR scanner. A polyethylene (PE) board was placed as a positron (β+) stopper at an axial distance of 3 cm from the soaked paper. The area under the peaks of one-pixel-wide profiles along the z-axis in coronal images was compared. Based on these measurements we confirmed our data in organic tissue. A larynx/trachea and lung of a butchered swine were injected with a mixture of NiSO4 for T1 MRI signals and Ga-68, simulating tumor lesions in the respiratory tract. The trachea/larynx were aligned at 35° to the MF lines and a small mass lesion was inserted to imitate a primary tracheal tumor, whereas the larynx was injected submucosally in the lower medial part of the epiglottis. Reconstructed PET data show that the ratio of β+ annihilated at the origin position and in the PE depends on the isotope energy and the direction of the MF. The annihilation ratios of the source and PE are 52.4/47.6 (F-18), 57.5/42.5 (Cu-64), 43.7/56.7 (C-11), 31.1/68.9 (N-13) and 14.9/85.1 (Ga-68). In the swine larynx measurement, an artefact with approximately 39% of the lesion activity formed along the MF lines 3 cm away from the original injection position (Fig. 1). The data of the trachea showed two shine artefacts with a symmetric alignment along the MF lines. About 58% of the positrons annihilated at the lesion and 21% formed each artefact. The PR effects are minor in tissue of density higher than or equal to that of water (0.096 cm-1). However, the effect is severe in low-density tissue or air and might lead to misinterpretation of clinical data.

Policy search is a successful approach to reinforcement learning. However, policy
improvements often result in the loss of information. Hence, it has been marred by
premature convergence and implausible solutions. As first suggested in the context of
covariant policy gradients, many of these problems may be addressed by constraining
the information loss. In this book chapter, we continue this path of reasoning and suggest
the Relative Entropy Policy Search (REPS) method. The resulting method differs
significantly from previous policy gradient approaches and yields an exact update step.
It works well on typical reinforcement learning benchmark problems. We will also
present a real-world application in which a robot employs REPS to learn how to return balls in a game of table tennis.
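
The core of an episodic REPS-style update can be sketched as follows: a temperature is obtained from the dual of the KL-bounded optimization, samples are reweighted by exponentiated returns, and the policy is refit by weighted maximum likelihood. The code below is a minimal sketch of this idea on a toy return landscape, not the derivation or implementation from the book chapter.

```python
import numpy as np
from scipy.optimize import minimize

def reps_weights(returns, epsilon=0.1):
    """Sample weights bounding the KL divergence between new and old sample distributions by epsilon."""
    R = returns - returns.max()                       # shift for numerical stability

    def dual(eta):
        eta = eta[0]
        return eta * epsilon + eta * np.log(np.mean(np.exp(R / eta)))

    eta = minimize(dual, x0=[1.0], bounds=[(1e-6, None)]).x[0]
    w = np.exp(R / eta)
    return w / w.sum(), eta

# Usage: sample policy parameters, observe returns, then refit the policy by weighted ML.
rng = np.random.default_rng(0)
theta = rng.normal(0.0, 1.0, size=200)                # parameters drawn from the current policy
returns = -(theta - 0.5) ** 2                         # toy return landscape
w, eta = reps_weights(returns)
new_mean = np.sum(w * theta)                          # weighted ML update of a Gaussian policy mean
```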

The maximum entropy (MaxEnt) framework has been studied extensively in supervised
learning. Here, the goal is to find a distribution p that maximizes an entropy function
while enforcing data constraints so that the expected values of some (pre-defined) features
with respect to p match their empirical counterparts approximately. Using different
entropy measures, different model spaces for p and different approximation criteria
for the data constraints yields a family of discriminative supervised learning methods
(e.g., logistic regression, conditional random fields, least squares and boosting). This
framework is known as the generalized maximum entropy framework.
Semi-supervised learning (SSL) has emerged in the last decade as a promising field
that combines unlabeled data along with labeled data so as to increase the accuracy and
robustness of inference algorithms. However, most SSL algorithms to date have had
trade-offs, e.g., in terms of scalability or applicability to multi-categorical data. We
extend the generalized MaxEnt framework to develop a family of novel SSL algorithms.
Extensive empirical evaluation on benchmark data sets that are widely used in
the literature demonstrates the validity and competitiveness of the proposed algorithms.
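
The generalized MaxEnt framework can be made concrete for a finite outcome space: maximizing Shannon entropy subject to relaxed moment constraints has a convex dual in which the relaxation appears as a regularizer. The sketch below assumes an l2 relaxation and hypothetical inputs (a feature matrix F and empirical moments); it illustrates the framework, not the proposed SSL algorithms.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

def generalized_maxent(F, moments, eps=0.1):
    """MaxEnt distribution over a finite outcome space with approximate moment constraints.

    F: (n_outcomes, n_features) feature matrix; moments: empirical feature means.
    The l2 relaxation of the constraints shows up as a ridge term in the dual.
    """
    def dual(lam):
        return logsumexp(F @ lam) - lam @ moments + 0.5 * eps * lam @ lam

    lam = minimize(dual, np.zeros(F.shape[1])).x
    log_p = F @ lam - logsumexp(F @ lam)       # exponential-family solution
    return np.exp(log_p), lam
```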

Objectives: We study the quantitative effect of not accounting for the attenuation of patient positioning aids in combined PET/MR imaging.
Methods: Positioning aids cannot be detected with conventional MR sequences. We mimic this effect using PET/CT data (Biograph HiRez16) with the foams removed from CT images prior to using them for CT-AC. PET/CT data were acquired using standard parameters (phantoms/patients): 120/140 kVp, 30/250 mAs, 5 mm slices, OSEM (4i, 8s, 5 mm filter) following CT-AC. First, a uniform 68Ge-cylinder was positioned centrally in the PET/CT and fixed with a vacuum mattress (10 cm thick). Second, the same cylinder was placed in 3 positioning aids from the PET/MR (BrainPET-3T). Third, 5 head/neck patients who were fixed in a vacuum mattress were selected. In all 3 studies PET recon post CT-AC based on measured CT images was used as the reference (mCT-AC). The PET/MR set-up was mimicked by segmenting the foam inserts from the measured CT images and setting their voxel values to -1000 HU (air). PET images were reconstructed using CT-AC with the segmented CT images (sCT-AC). PET images with mCT- and sCT-AC were compared.
Results: sCT-AC underestimated PET voxel values in the phantom by 6.7% on average compared to mCT-AC with the vacuum mattress in place. 5% of the PET voxels were underestimated by >=10%. Not accounting for MR positioning aids during AC led to an underestimation of 2.8% following sCT-AC, with 5% of the PET voxels being underestimated by >=7% with respect to mCT-AC. Preliminary evaluation of the patient data indicates a slightly higher bias from not accounting for patient positioning aids (mean: -9.1%, 5th percentile: -11.2%).
Conclusions: A considerable and regionally variable underestimation of the PET activity following AC is observed when positioning aids are not accounted for. This bias may become relevant in neurological activation or dementia studies with PET/MR.

Brain-computer interfaces (BCIs) are limited in their applicability in everyday settings by the current necessity to record subject-specific calibration data prior to actual use of the BCI for communication. In this work, we utilize the framework of multitask learning to construct a BCI that can be used without any subject-specific calibration process, i.e., with zero training data. In BCIs based on EEG or MEG, the predictive function of a subject's intention is commonly modeled as a linear combination of some features derived from spatial and spectral recordings. The coefficients of this combination correspond to the importance of the features for predicting the intention of the subject. These coefficients are usually learned separately for each subject due to inter-subject variability. Principal feature characteristics, however, are known to remain invariant across subjects. For example, it is well known that in motor imagery paradigms spectral power in the mu and beta frequency ranges (roughly 8-14 Hz and 20-30 Hz, respectively) over sensorimotor areas provides the most information about a subject's intention. Based on this assumption, we define the intention prediction function as a combination of subject-invariant and subject-specific models, and propose a machine learning method that infers these models jointly using data from multiple subjects. This framework leads to an out-of-the-box intention predictor, where the subject-invariant model can be employed immediately for a subject with no prior data. We present a computationally efficient method to further improve this BCI to incorporate subject-specific variations as such data become available. To overcome the problem of high-dimensional feature spaces in this context, we further present a new method for finding the relevance of different recording channels according to actions performed by subjects. Usually, the BCI feature representation is a concatenation of spectral features extracted from different channels. This representation, however, is redundant, as recording channels at different spatial locations typically measure overlapping sources within the brain due to volume conduction. We address this problem by assuming that the relevance of different spectral bands is invariant across channels, while learning different weights for each recording electrode. This framework allows us to significantly reduce the feature space dimensionality without discarding potentially useful information. Furthermore, the resulting out-of-the-box BCI can be adapted to different experimental setups, for example EEG caps with different numbers of channels, as long as there exists a mapping across channels in different setups. We demonstrate the feasibility of our approach on a set of experimental EEG data recorded during a standard two-class motor imagery paradigm from a total of ten healthy subjects. Specifically, we show that satisfactory classification results can be achieved with zero training data, and that combining prior recordings with subject-specific calibration data substantially outperforms using subject-specific data only.
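
One simple way to realize the subject-invariant plus subject-specific decomposition is an alternating ridge regression over subjects, as sketched below. The penalties, the solver, and the function name are assumptions made for illustration; the abstract's actual inference procedure is not reproduced here.

```python
import numpy as np

def fit_multitask_ridge(Xs, ys, lam_shared=1.0, lam_subj=10.0, n_iter=50):
    """Jointly fit a shared weight vector w0 and subject-specific offsets v_s for linear prediction."""
    d = Xs[0].shape[1]
    w0 = np.zeros(d)
    vs = [np.zeros(d) for _ in Xs]
    for _ in range(n_iter):
        # update each subject-specific part given the shared part
        for s, (X, y) in enumerate(zip(Xs, ys)):
            A = X.T @ X + lam_subj * np.eye(d)
            vs[s] = np.linalg.solve(A, X.T @ (y - X @ w0))
        # update the shared part given all subject-specific residuals
        A = sum(X.T @ X for X in Xs) + lam_shared * np.eye(d)
        b = sum(X.T @ (y - X @ v) for X, y, v in zip(Xs, ys, vs))
        w0 = np.linalg.solve(A, b)
    return w0, vs   # use w0 alone for a new subject with zero calibration data
```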

Background and Objective:
While machine learning approaches have led to tremendous advances in brain-computer interfaces (BCIs) in recent years (cf. [1]), there still exists a large variation in performance across subjects. Furthermore, a significant proportion of subjects appears incapable of achieving above chance-level classification accuracy [2], which to date includes all subjects in a completely locked-in state that have been trained in BCI control. Understanding the reasons for this variation in performance arguably constitutes one of the most fundamental open questions in research on BCIs.
Methods & Results
Using a machine learning approach, we derive a trial-wise measure of how well EEG recordings can be classified as either left- or right-hand motor imagery. Specifically, we train a support vector machine (SVM) on log-bandpower features (7-40 Hz) derived from EEG channels after spatial filtering with a surface Laplacian, and then compute the trial-wise distance of the output of the SVM from the separating hyperplane using a cross-validation procedure. We then correlate this trial-wise performance measure, computed on EEG recordings of ten healthy subjects, with log-bandpower in the gamma frequency range (55-85 Hz), and demonstrate that it is positively correlated with frontal- and occipital gamma-power and negatively correlated with centro-parietal gamma-power. This correlation is shown to be highly significant on the group level as well as in six out of ten subjects on the single-subject level. We then utilize the framework for causal inference developed by Pearl, Spirtes and others [3,4] to present evidence that gamma-power is not only correlated with BCI performance but does indeed exert a causal influence on it.
Discussion and Conclusions
Our results indicate that successful execution of motor imagery, and hence reliable communication by means of a BCI based on motor imagery, requires a volitional shift of gamma-power from centro-parietal to frontal and occipital regions. As such, our results provide the first non-trivial explanation for the variation in BCI performance across and within subjects. As this topographical alteration in gamma-power is likely to correspond to a specific attentional shift, we propose to provide subjects with feedback on their topographical distribution of gamma-power in order to establish the attentional state required for successful execution of motor imagery.
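
A minimal sketch of the trial-wise performance measure described above, assuming scikit-learn and placeholder variable names: cross-validated SVM decision values on log-bandpower features are signed by the true label and then correlated with gamma-band log-power. This illustrates the measure, not the study's full pipeline or its causal analysis.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC

def trialwise_gamma_correlation(features, labels, gamma_power):
    """Correlate a trial-wise classifiability measure with gamma-band power.

    features: (n_trials, n_features) log-bandpower features; labels: +1/-1 (or 1/0, with 1 positive);
    gamma_power: (n_trials,) gamma-band log-power per trial.
    """
    svm = SVC(kernel='linear')
    # signed distance from the separating hyperplane for each held-out trial
    dist = cross_val_predict(svm, features, labels, cv=10, method='decision_function')
    performance = dist * np.where(labels == 1, 1.0, -1.0)   # positive = correct side of the hyperplane
    r, p = pearsonr(performance, gamma_power)
    return r, p
```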

16th Conference of the International Linear Algebra Society (ILAS 2010), 16, pages: 19, June 2010, based on joint work with Dongmin Kim and Inderjit Dhillon (poster)

Abstract

We study the fundamental problem of nonnegative least
squares. This problem was apparently introduced by Lawson
and Hanson [1] under the name NNLS. As is evident
from its name, NNLS seeks least-squares solutions that are
also nonnegative. Owing to its wide applicability, numerous
algorithms have been derived for NNLS, beginning from the
active-set approach of Lawson and Hanson [1] leading up to
the sophisticated interior-point method of Bellavia et al. [2].
We present a new algorithm for NNLS that combines projected
subgradients with the non-monotonic gradient descent
idea of Barzilai and Borwein [3]. Our resulting algorithm is
called BBSG, and we guarantee its convergence by exploiting
properties of NNLS in conjunction with projected subgradients.
BBSG is surprisingly simple and scales well to large
problems. We substantiate our claims by empirically evaluating
BBSG and comparing it with established convex solvers
and specialized NNLS algorithms. The numerical results suggest
that BBSG is a practical method for solving large-scale
NNLS problems.
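
A minimal sketch of the general ingredients named above, projection onto the nonnegative orthant combined with a Barzilai-Borwein step length, is given below; it illustrates the idea of projected BB steps for NNLS, not the BBSG algorithm itself, which additionally exploits NNLS-specific properties to guarantee convergence.

```python
import numpy as np

def nnls_projected_bb(A, b, n_iter=500):
    """Approximately solve min_{x >= 0} ||Ax - b||^2 with a projected Barzilai-Borwein gradient method."""
    x = np.zeros(A.shape[1])
    g = A.T @ (A @ x - b)                       # gradient of the least-squares objective
    step = 1.0 / np.linalg.norm(A, 2) ** 2      # conservative initial step length
    for _ in range(n_iter):
        x_new = np.maximum(x - step * g, 0.0)   # projected gradient step
        g_new = A.T @ (A @ x_new - b)
        s, y = x_new - x, g_new - g
        denom = s @ y
        step = (s @ s) / denom if denom > 1e-12 else step   # BB1 step length
        x, g = x_new, g_new
    return x
```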

Objectives: The purpose of this study is the evaluation of patients suffering from hemato-oncological diseases with complications at the lower extremities using simultaneous PET/MRI.
Methods: To date, two patients (chronic active graft-versus-host disease [GvHD], B-cell non-Hodgkin lymphoma [B-NHL]) were examined before and after therapy in a 3-Tesla BrainPET/MRI hybrid system following
F-18-FDG-PET/CT. Simultaneous static PET (1200 sec.) and MRI scans (T1WI, T2WI, post-CA) were acquired.
Results: Initial results show the feasibility of using hybrid PET/MRI-technology for musculoskeletal imaging of the lower extremities. Simultaneous PET and MRI could be acquired in diagnostic quality.
Before treatment, our patient with GvHD had high fascial and muscular FDG uptake, possibly due to muscle encasement. T2WI and post-gadolinium T1WI revealed fascial thickening and signs of inflammation.
After therapy with steroids followed by imatinib, the patient's symptoms improved and the muscular FDG uptake dropped, whereas the MRI signal remained unchanged. We assume that fascial elasticity improved
during therapy despite persistence of the fascial thickening. The examination of the second patient, with a B-NHL manifestation in the tibia, showed a significant signal and uptake decrease in the bone marrow and
surrounding lesions in both MRI and PET after therapy with rituximab. The lack of residual FDG uptake proved superior to MRI information alone in helping to exclude vital tumor.
Conclusions: Combined PET/MRI is a powerful tool to monitor diseases requiring high soft tissue contrast along with molecular information from the FDG uptake.

We present an efficient algorithm for large-scale non-negative least-squares
(NNLS). We solve NNLS by extending the unconstrained quadratic optimization
method of Barzilai and Borwein (BB) to handle nonnegativity constraints.
Our approach is simple yet efficient. It differs from other constrained BB variants
in that: (i) it uses a specific subset of variables for computing BB steps; and
(ii) it scales these steps adaptively to ensure convergence. We compare our
method with both established convex solvers and specialized NNLS methods,
and observe highly competitive empirical performance.

We present a method for sparse regression problems. Our method is based on
the nonsmooth trust-region framework that minimizes a sum of smooth convex
functions and a nonsmooth convex regularizer. By employing a separable
quadratic approximation to the smooth part, the method enables the use of proximity
operators, which in turn allow tackling the nonsmooth part efficiently. We
illustrate our method by implementing it for three important sparse regression
problems. In experiments with synthetic and real-world large-scale data, our
method is seen to be competitive, robust, and scalable.
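
The two ingredients named in the abstract, a separable quadratic approximation of the smooth part and a proximity operator for the nonsmooth part, combine into the familiar proximal-gradient iteration sketched below for the l1-regularized least-squares case. This is a minimal sketch of those ingredients, not the nonsmooth trust-region method itself.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximity operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_regression_prox(A, b, lam=0.1, n_iter=300):
    """Minimize 0.5 * ||Ax - b||^2 + lam * ||x||_1 with proximal-gradient steps."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the smooth gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)         # gradient of the smooth part
        x = soft_threshold(x - grad / L, lam / L)   # prox step on the separable quadratic model
    return x
```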

Workshop "Foundations and New Trends of PAC Bayesian Learning", 2010, March 2010 (poster)

Abstract

We applied the PAC-Bayesian framework to derive generalization bounds for co-clustering. The analysis yielded regularization terms that were absent in the preceding formulations of this task. The bounds suggested that co-clustering should optimize a trade-off between its empirical performance and the mutual information that the cluster variables preserve on row and column indices. Proper regularization enabled us to achieve state-of-the-art results in prediction of the missing ratings in the MovieLens collaborative filtering dataset.
In addition, a PAC-Bayesian bound for discrete density estimation was derived. We have shown that the PAC-Bayesian bound for classification is a special case of the PAC-Bayesian bound for discrete density estimation. We further introduced combinatorial priors to PAC-Bayesian analysis. Combinatorial priors are more appropriate for discrete domains, as opposed to Gaussian priors, which are suitable for continuous domains. It was shown that combinatorial priors lead to regularization terms in the form of mutual information.
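
The regularizer suggested by the bound, the mutual information that a cluster variable preserves about the row (or column) index, can be computed directly from soft cluster assignments. The sketch below assumes uniform rows and a hypothetical assignment matrix Q; it illustrates the quantity being regularized, not the bound or its optimization.

```python
import numpy as np

def cluster_index_mutual_information(Q):
    """Mutual information (in bits) between a cluster variable and the row index.

    Q[i, c] = q(c | row i) is a soft cluster assignment; rows are assumed uniformly distributed.
    """
    Q = Q / Q.sum(axis=1, keepdims=True)
    marginal = Q.mean(axis=0)                                         # p(c)
    H_C = -np.sum(marginal * np.log2(marginal + 1e-12))               # H(C)
    H_C_given_I = -np.mean(np.sum(Q * np.log2(Q + 1e-12), axis=1))    # H(C | row index)
    return H_C - H_C_given_I
```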

The pedestal effect is the large improvement in the detectability of a sinusoidal signal grating observed when the signal is added to a masking or pedestal grating of the same spatial frequency, orientation, and phase. We measured the pedestal effect in both broadband and notched noise - noise from which a 1.5-octave band centred on the signal frequency had been removed. Although the pedestal effect persists in broadband noise, it almost disappears in the notched noise. Furthermore, the pedestal effect is substantial when either high- or low-pass masking noise is used. We conclude that the pedestal effect in the absence of notched noise results principally from the use of information derived from channels with peak sensitivities at spatial frequencies different from that of the signal and pedestal. The spatial-frequency components of the notched noise above and below the spatial frequency of the signal and pedestal prevent the use of information about changes in contrast carried in channels tuned to spatial frequencies that are very much different from that of the signal and pedestal. Thus the pedestal or dipper effect measured without notched noise is not a characteristic of individual spatial-frequency tuned channels.

We present easy-to-use alternatives to the often-used two-stage Common Spatial Pattern + classifier approach for spatial filtering and classification of Event-Related Desynchronization signals in BCI. We report two algorithms that aim to optimize the spatial filters according to a criterion more directly related to the ability of the algorithms to generalize to unseen data. Both are based upon the idea of treating the spatial filter coefficients as hyperparameters of a kernel or covariance function. We then optimize these hyperparameters directly alongside the normal classifier parameters with respect to our chosen learning objective function. The two objectives considered are margin maximization, as used in Support Vector Machines, and the evidence maximization framework used in Gaussian Processes. Our experiments assessed generalization error as a function of the number of training points used, on 9 BCI competition data sets and 5 offline motor imagery data sets measured in Tübingen. Both our approaches show consistent improvements relative to the commonly used CSP+linear classifier combination. Strikingly, the improvement is most significant in the higher-noise cases, when either few trials are used for training or with the most poorly performing subjects. This is a reversal of the usual "rich get richer" effect in the development of CSP extensions, which tend to perform best when the signal is strong enough to accurately find their additional parameters. This makes our approach particularly suitable for clinical applications where high levels of noise are to be expected.

The human visual system samples images through saccadic eye movements which rapidly change the point of fixation. Although the selection of eye movement targets depends on numerous top-down mechanisms, a number of recent studies have shown that low-level image features such as local contrast or edges play an important role. These studies typically used predefined image features which were afterwards experimentally verified.
Here, we follow a complementary approach: instead of testing a set of candidate image features, we infer these hypotheses from the data, using methods from statistical learning. To this end, we train a non-linear classifier on fixated vs. randomly selected image patches without making any physiological assumptions. The resulting classifier can be essentially characterized by a nonlinear combination of two center-surround receptive fields. We find that the prediction performance of this simple model on our eye movement data is indistinguishable from the physiologically motivated model of Itti & Koch (2000), which is far more complex. In particular, we obtain a comparable performance without using any multi-scale representations, long-range interactions or oriented image features.

Human observers are capable of detecting animals within novel natural scenes with remarkable speed and accuracy. Despite the seeming complexity of such decisions, it has been hypothesized that a simple global image feature, the relative abundance of high spatial frequencies at certain orientations, could underlie such fast image classification (A. Torralba & A. Oliva, Network: Comput. Neural Syst., 2003).
We successfully used linear discriminant analysis to classify a set of 11,000 images into “animal” and “non-animal” images based on their individual amplitude spectra only (Drewes, Wichmann, Gegenfurtner VSS 2005). We proceeded to sort the images based on the performance of our classifier, retaining only the best and worst classified 400 images (“best animals”, “best distractors” and “worst animals”, “worst distractors”).
We used a Go/No-go paradigm to evaluate human performance on this subset of our images. Both reaction time and proportion of correctly classified images showed a significant effect of classification difficulty. Images more easily classified by our algorithm were also classified faster and better by humans, as predicted by the Torralba & Oliva hypothesis.
We then equated the amplitude spectra of the 400 images, which, by design, reduced algorithmic performance to chance whereas human performance was only slightly reduced (cf. Wichmann, Rosas, Gegenfurtner, VSS 2005). Most importantly, the same images as before were still classified better and faster, suggesting that even in the original condition features other than specifics of the amplitude spectrum made particular images easy to classify, clearly at odds with the Torralba & Oliva hypothesis.

The pedestal or dipper effect is the large improvement in the detectability of a sinusoidal grating observed when the signal is added to a pedestal or masking grating having the signal's spatial frequency, orientation, and phase. The effect is largest with pedestal contrasts just above the 'threshold' in the absence of a pedestal.
We measured the pedestal effect in both broadband and notched masking noise---noise from which a 1.5-octave band centered on the signal and pedestal frequency had been removed. The pedestal effect persists in broadband noise, but almost disappears with notched noise. The spatial-frequency components of the notched noise that lie above and below the spatial frequency of the signal and pedestal prevent the use of information about changes in contrast carried in channels tuned to spatial frequencies that are very much different from that of the signal and pedestal. We conclude that the pedestal effect in the absence of notched noise results principally from the use of information derived from channels with peak sensitivities at spatial frequencies that are different from that of the signal and pedestal. Thus the pedestal or dipper effect is not a characteristic of individual spatial-frequency tuned channels.

The pedestal or dipper effect is the large improvement in the detectability of a sinusoidal grating observed when the signal is added to a pedestal or masking grating having the signal's spatial frequency, orientation, and phase. The effect is largest with pedestal contrasts just above the threshold in the absence of a pedestal. We measured the pedestal effect in both broadband and notched masking noise---noise from which a 1.5-octave band centered on the signal and pedestal frequency had been removed. The pedestal effect persists in broadband noise, but almost disappears with notched noise. The spatial-frequency components of the notched noise that lie above and below the spatial frequency of the signal and pedestal prevent the use of information about changes in contrast carried in channels tuned to spatial frequencies that are very much different from that of the signal and pedestal. We conclude that the pedestal effect in the absence of notched noise results principally from the use of information derived from channels with peak sensitivities at spatial frequencies that are different from that of the signal and pedestal. Thus the pedestal or dipper effect is not a characteristic of individual spatial-frequency tuned channels.

Human observers are capable of detecting animals within novel natural scenes with remarkable speed and accuracy. Despite the seeming complexity of such decisions, it has been hypothesized that a simple global image feature, the relative abundance of high spatial frequencies at certain orientations, could underlie such fast image classification [1].
We successfully used linear discriminant analysis to classify a set of 11,000 images into animal and non-animal images based on their individual amplitude spectra only [2]. We proceeded to sort the images based on the performance of our classifier, retaining only the best and worst classified 400 images ("best animals", "best distractors" and "worst animals", "worst distractors").
We used a Go/No-go paradigm to evaluate human performance on this subset of our images. Both reaction time and proportion of correctly classified images showed a significant effect of classification difficulty. Images more easily classified by our algorithm were also classified faster and better by humans, as predicted by the Torralba & Oliva hypothesis.
We then equated the amplitude spectra of the 400 images, which, by design, reduced algorithmic performance to chance whereas human performance was only slightly reduced [3]. Most importantly, the same images as before were still classified better and faster, suggesting that even in the original condition features other than specifics of the amplitude spectrum made particular images easy to classify, clearly at odds with the Torralba & Oliva hypothesis.

The performance of unsupervised learning models for natural images is evaluated quantitatively by means of information theory. We estimate the gain in statistical independence (the multi-information reduction) achieved with independent component analysis (ICA), principal component analysis (PCA), zero-phase whitening, and predictive coding. Predictive coding is translated into the transform coding framework, where it can be characterized by the constraint of a triangular filter matrix. A randomly sampled whitening basis and the Haar wavelet are included into the comparison as well. The comparison of all these methods is carried out for different patch sizes, ranging from 2x2 to 16x16 pixels. In spite of large differences in the shape of the basis functions, we find only small differences in the multi-information between all decorrelation transforms (5% or less) for all patch sizes. Among the second-order methods, PCA is optimal for small patch sizes and predictive coding performs best for large patch sizes. The extra gain achieved with ICA is always less than 2%. In conclusion, the 'edge filters' found with ICA lead only to a surprisingly small improvement in terms of its actual objective.
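
Two of the second-order decorrelation transforms compared above, PCA whitening and zero-phase (symmetric) whitening, can be obtained from the patch covariance as sketched below. This is a minimal illustration with assumed input conventions, not the evaluation code used for the multi-information estimates.

```python
import numpy as np

def whitening_transforms(patches):
    """PCA and zero-phase (ZCA/symmetric) whitening matrices for vectorized image patches (one per row)."""
    X = patches - patches.mean(axis=0)
    C = np.cov(X, rowvar=False)                    # patch covariance
    eigval, E = np.linalg.eigh(C)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(eigval + 1e-8))
    W_pca = D_inv_sqrt @ E.T                       # project onto principal axes and rescale
    W_zca = E @ D_inv_sqrt @ E.T                   # symmetric ("zero-phase") whitening
    return W_pca, W_zca
```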

Our goal is to understand the principles of Perception, Action and Learning in autonomous systems that successfully interact with complex environments, and to use this understanding to design future systems.