Looks approximately like SVM: perform binary classification on a high-dimensional manifold (or sets of manifolds in this case).

The general idea behind Mcp_simple is to start with a finite number of training examples, find the maximum-margin solution for that training set, augment the training set by finding a point on the manifolds that violates the constraints, and iterate the process until a tolerance criterion is met.
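A minimal sketch of that loop, assuming each manifold is given as a dense array of sample points and using scikit-learn's linear SVC with a large C as a stand-in for the hard-margin solver (the `worst_point` search is my simplification of the paper's constraint-violation subproblem):

```python
import numpy as np
from sklearn.svm import SVC

def worst_point(w, b, manifold, label):
    # Return the sample on this manifold that most violates the margin,
    # i.e. minimizes label * (w @ x + b); the real algorithm solves this
    # subproblem over the continuous manifold, not a finite sample.
    scores = label * (manifold @ w + b)
    i = np.argmin(scores)
    return manifold[i], scores[i]

def mcp_simple(manifolds, labels, tol=1e-3, max_iter=100):
    X = [m[0] for m in manifolds]  # seed: one point per manifold
    y = list(labels)
    for _ in range(max_iter):
        clf = SVC(kernel="linear", C=1e6).fit(X, y)  # ~max-margin
        w, b = clf.coef_[0], clf.intercept_[0]
        violated = False
        for m, lab in zip(manifolds, labels):
            x, s = worst_point(w, b, m, lab)
            if s < 1 - tol:        # margin constraint violated
                X.append(x)
                y.append(lab)
                violated = True
        if not violated:           # tolerance criterion met
            break
    return clf
```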

The more complicated cutting-plane SVM uses slack variables to allow solutions where the classes are not linearly separable.

They propose using one slack variable per manifold, plus a manifold center, which strictly obeys the margin (classification) constraint.
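A minimal sketch of the resulting objective, in my notation (not necessarily the paper's): one slack variable $\xi_a$ per labeled manifold $M_a$, with its center $c_a$ held to the hard margin:

$$\min_{w,b,\xi}\ \tfrac{1}{2}\|w\|^2 + C\sum_a \xi_a \quad \text{s.t.}\quad y_a(w^\top x + b) \ge 1 - \xi_a \ \ \forall x \in M_a, \qquad y_a(w^\top c_a + b) \ge 1, \qquad \xi_a \ge 0.$$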

Much effort is put into proving the convergence properties of these algorithms; admittedly, I couldn't be bothered to read that part...

This has been my intuition for a while; you can learn abstract rules via active probing of the environment. This paper supports such intuitions with extensive scholarship.

“The basic theme of this article is that one can cast learning, inference, and decision making as processes that resolve uncertainty about the world.”

References Schmidhuber 1991

“A learner should choose a policy that also maximizes the learner’s predictive power. This makes the world both interesting and exploitable.” (Still and Precup 2012)

“Our approach rests on the free energy principle, which asserts that any sentient creature must minimize the entropy of its sensory exchanges with the world.” OK, that might be generalizing things too far...

Levels of uncertainty:

Perceptual inference: uncertainty about the causes of sensory outcomes under a particular policy.

Uncertainty about policies or about future states of the world, outcomes, and the probabilistic contingencies that bind them.

For the last element (probabilistic contingencies between the world and outcomes), they employ Bayesian model selection / Bayesian model reduction.

This can be done not only with the data, but also purely on the initial model itself (i.e., model reduction without new data).

“We use simulations of abstract rule learning to show that context-sensitive contingencies, which are manifest in a high-dimensional space of latent or hidden states, can be learned with straightforward variational principles (i.e., minimization of free energy).”

Assume that initial states and state transitions are known.

Perception or inference about hidden states (i.e., state estimation) corresponds to inverting a generative model given a sequence of outcomes, while learning involves updating the parameters of the model.

The actual task is quite simple: central fixation leads to a color cue; the cue plus the peripheral color determines which way to saccade.

Gestalt: Good intuitions, but I’m left with the impression that the authors overexplain and / or make the description more complicated than it need be.

The actual number of parameters to be inferred is rather small -- 3 states in 4 (?) dimensions, and these parameters are not hard to learn by minimizing the variational free energy:

$F = D_{\mathrm{KL}}[Q(x)\,\|\,P(x)] - \mathbb{E}_Q[\ln P(o_t \mid x)]$, where $D_{\mathrm{KL}}$ is the Kullback-Leibler divergence.
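A toy numerical check of this quantity for a categorical hidden state (the distribution values below are made up):

```python
import numpy as np

def free_energy(q, p_x, p_o_given_x, o):
    # F = KL[Q(x) || P(x)] - E_Q[ln P(o | x)] for a categorical x.
    kl = np.sum(q * np.log(q / p_x))
    expected_log_lik = np.sum(q * np.log(p_o_given_x[:, o]))
    return kl - expected_log_lik

q = np.array([0.7, 0.2, 0.1])        # approximate posterior Q(x)
p_x = np.full(3, 1 / 3)              # prior P(x)
p_o_given_x = np.array([[0.9, 0.1],  # likelihood P(o | x)
                        [0.5, 0.5],
                        [0.2, 0.8]])
print(free_energy(q, p_x, p_o_given_x, o=0))  # lower F = better fit
```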

Mean field approximation: Q(x) is fully factored (not used here). Many more notes...

Recorded and stimulated the LMAN (upstream, modulatory) region of the zebra finch song-production and learning pathway.

Found evidence, albeit weak, for a mirror arrangement or 'causal inverse' there: neurons fire bursts prior to syllable production with some motor lead (~30 ms), and also fire single spikes with a ~10 ms delay to the same syllables.

This leads to an overall 'mirroring offset' of about 40 ms, which is sufficiently supported by the data.

The mirroring offset is quantified by looking at the cross-covariance of audio-synchronized motor and sensory firing rates.
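A sketch of that measurement, assuming two firing-rate vectors already binned and aligned to the audio (function and variable names are mine):

```python
import numpy as np

def mirroring_offset(motor_rate, sensory_rate, dt):
    # Lag (in seconds) at which the cross-covariance of the audio-
    # aligned rates peaks; positive means sensory activity lags motor.
    m = motor_rate - motor_rate.mean()
    s = sensory_rate - sensory_rate.mean()
    xcov = np.correlate(s, m, mode="full")
    lags = np.arange(-len(m) + 1, len(s)) * dt
    return lags[np.argmax(xcov)]
```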

Causal inverse: a sensory target input generates the motor activity pattern required to cause, or generate, that same sensory target.

Similar to the idea of temporal inversion via memory.

Data is interesting, but not super strong; per the discussion, the authors were going for a much broader theory:

Normal Hebbian learning says that if a presynaptic neuron fires before a postsynaptic neuron, then the synapse is potentiated.

However, there is another side of the coin: if the presynaptic neuron fires after the postsynaptic neuron, the synapse can be similarly strengthened, permitting the learning of inverse models.

"This order allows sensory feedback arriving at motor neurons to be associated with past postsynaptic patterns of motor activity that could have caused this sensory feedback. " So: stimulate the sensory neuron (here hypothetically in LMAN) to get motor output; motor output is indexed in the sensory space.

In mammals, a similar rule has been found to describe synaptic connections from the cortex to the basal ganglia [37].

... or, based on anatomy, a causal inverse could be connected to the dopaminergic VTA, thereby linking with reinforcement learning theories.

Simple reinforcement learning strategies can be enhanced with inverse models as a means to solve the structural credit assignment problem [49].

Need to review the literature here to see how well these theories of cortex -> basal-ganglia synapses match the data.

Fitness prediction is a technique that replaces fitness evaluation in evolutionary algorithms with a lightweight approximation that adapts along with the solution population.

The predictors cannot approximate the full fitness landscape, but they shift focus during evolution.

Aka local caching.

Or adversarial techniques.

Instead, they use coevolution, with three populations:

1) solutions to the original problem, evaluated using only fitness predictors;

2) fitness predictors of the problem; and

3) fitness trainers, whose exact fitness is used to train predictors.

Trainers are selected as the solutions with the highest fitness variance across the predictors, and the predictors are trained on this subset.

Lightweight fitness predictors evolve faster than the solution population, so the authors cap the computational effort spent on predictors at 5% of the overall effort.

These fitness predictors are basically an array of integers which index the full training set -- very simple and linear. Maybe boring, but the simplest solution that works ...

They sample only 8 training examples, even for complex 30-node solution functions (!!).
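A toy illustration of how small such a predictor is (the task, data, and function names here are my inventions; only the array-of-indices representation and the 8-sample size come from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy full training set for a symbolic-regression-style problem.
X = np.linspace(-3, 3, 1000)
y = X**3 - 2 * X                               # hidden target function

def predicted_fitness(f, predictor):
    # Evaluate a candidate only on the predictor's sampled cases.
    return -np.mean((f(X[predictor]) - y[predictor]) ** 2)

def exact_fitness(f):
    return -np.mean((f(X) - y) ** 2)

def mutate(predictor):
    # Predictor evolution is just resampling indices (the paper also
    # crosses over index arrays).
    p = predictor.copy()
    p[rng.integers(p.size)] = rng.integers(X.size)
    return p

predictor = rng.integers(0, X.size, size=8)    # 8 samples, as in the paper
candidate = lambda x: x**3 - 1.9 * x           # an imperfect candidate
print(predicted_fitness(candidate, predictor), exact_fitness(candidate))
print(predicted_fitness(candidate, mutate(predictor)))
```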

I guess, because the information introduced into the solution set is relatively small per generation, it makes little sense to over-sample or over-specify this; all that matters is that, on average, it's directionally correct and unbiased.

Used deterministic crowding selection as the evolutionary algorithm.

Similar individuals have to compete in tournaments for space.
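A compact sketch of deterministic crowding (my implementation of the standard scheme, not code from the paper):

```python
def deterministic_crowding(p1, p2, c1, c2, fitness, dist):
    # Pair each child with its most similar parent; a child replaces
    # that parent only if it is at least as fit.
    if dist(p1, c1) + dist(p2, c2) <= dist(p1, c2) + dist(p2, c1):
        pairs = [(p1, c1), (p2, c2)]
    else:
        pairs = [(p1, c2), (p2, c1)]
    return [c if fitness(c) >= fitness(p) else p for p, c in pairs]
```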

Showed that the coevolution algorithm is capable of inferring even highly complex, many-term functions.

Once the probes were fully advanced into the brain, we observed a decline in the compression force over time.

However, the compression force never decreased to zero.

This may indicate that chronically implanted probes experience a constant compression force when inserted in the brain, which may push the probe out of the brain over time if there is nothing to keep it in a fixed position.

Yet... the Utah array seems to be fine for up to many months in humans.

This may be a drawback for flexible probes [24], [25]. The approach to reduce tissue damage by reducing micromotion by not tethering the probe to the skull can also have this disadvantage [26]. Furthermore, the upward movement may lead to the inability of the contacts to record signals from the same neurons over long periods of time.

We did not observe a difference in initial insertion force, amount of dimpling, or the rest force after a 3-min rest period, but the force at the end of the insertion was significantly higher when inserting at 100 μm/s compared to 10 μm/s.

No significant difference in histological response observed between the two speeds.

Tissue damage, evaluated as the size of the hole left by the needle after retraction, bleeding, and tissue fracturing, was found to increase for increasing insertion speeds and was higher within white matter regions.

A statistically significant difference in hole areas with respect to insertion speed was found.

While there are no previous needle insertion speed studies with which to directly compare, previous electrode insertion studies have noted greater brain surface dimpling and insertion forces with increasing insertion speed [43–45]. These higher deformation and force measures may indicate greater brain tissue damage which is in agreement with the present study.

There are also studies which have found that fast insertion of sharp tip electrodes produced less blood vessel rupture and bleeding [28,29].

These differences in rate dependent damage may be due to differences in tip geometry (diameter and tip) or tissue region, since these electrode studies focus mainly on the cortex [28,29].

In the present study, hole measurements were small in the cortex, and no substantial bleeding was observed in the cortex except when it was produced during dura mater removal.

Any hemorrhage was observed primarily in white matter regions of the external capsule and the CPu.

Rapid deformation results in greater pressurization of fluid filled spaces if fluid does not have time to redistribute, making the tissue effectively stiffer. This may occur in compacted tissues below or surrounding the needle and result in increasing needle forces with increasing needle speed.

(Interesting): eight identical electrode arrays implanted into the same region of different animals have shown that half the arrays continue to record neural signals for >14 weeks while in the other half of the arrays, single-unit yield rapidly degraded and ultimately failed over the same timescale.

In another study, aimed at uncovering the time course of insertion-related bleeding and coagulation, electrodes were implanted into the cortex of rats at varying time intervals (−120, −90, −60, −30, −15, and 0 min) using a micromanipulator and linear motor with an insertion speed of 2 mm/s [40]. The results showed dramatic variability in BBB leakage that washed out any trend (Figure 3), suggesting that a separate underlying cause was responsible for the large inter- and intra-animal variability.

Somatostatin, a neuropeptide, has an ill-defined role; it is unknown when it is released.

SST interneurons receive diffuse input from cortical pyramidal cells, but each synapse is of low strength.

SST interneurons are frequently electrically coupled through gap junctions, but almost never connected through chemical synapses. The resulting network can extend for hundreds of microns, and has been shown to cause synchronized firing when the cells are active.