On the question of using rapid switching as the basis of selective
attention, I did some work developing control algorithms based on
this principle. The basic idea was to switch between negative feedback
and positive feedback, depending upon some control criteria. Without
getting into details, the dynamic response of a simple controller was
most interesting. First of all, when the controller achieved zero error,
the switching rate tended towards the maximum limit (infinity in
theory). Second, the response to an error showed a trajectory with no
overshoot, unlike the overshoot that normally occurs with algorithms
based on linear theory. In addition, the response gave a substantially
smaller integrated absolute error than an equivalent controller based
on linear theory. Casual observation of our own motor movements is
consistent with this response, in that overshoot is usually not observed.
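
As a rough illustration of this kind of behaviour (not the algorithm
from the work described above, whose details are not given here), the
sketch below simulates one textbook form of switched feedback, a
variable-structure controller. Everything specific in it is an
assumption made for the example: the plant (a double integrator,
e'' = u), the switching function s = c*e + de built from the error and
its rate of change, and the names simulate_switching, alpha and c.

```python
def simulate_switching(alpha=4.0, c=1.0, e0=1.0, dt=1e-3, t_end=10.0):
    """Simulate e'' = u with u = -gain * e (hypothetical example plant).

    gain = +alpha (negative feedback) when e * s > 0,
    gain = -alpha (positive feedback) otherwise,
    where s = c * e + de combines the error and its rate of change.
    A sliding motion along s = 0 requires c * c < alpha.
    """
    e, de = e0, 0.0
    errors = [e]
    switches = 0
    prev_gain = None
    for _ in range(int(round(t_end / dt))):
        s = c * e + de
        gain = alpha if e * s > 0 else -alpha
        if prev_gain is not None and gain != prev_gain:
            switches += 1          # count sign flips of the feedback
        prev_gain = gain
        u = -gain * e
        de += u * dt               # semi-implicit Euler: rate first,
        e += de * dt               # then error using the updated rate
        errors.append(e)
    return errors, switches


errors, switches = simulate_switching()
print("minimum error:", min(errors))
print("final error:  ", errors[-1])
print("gain switches:", switches)
```

With these assumed values the error reaches the line de = -c*e and then
decays along it without changing sign, while the feedback sign flips
every few time steps. In a simulation the switching rate is of course
capped by the step size dt; shrinking dt raises it, which is the
discrete counterpart of the "infinity in theory" limit.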

In the controller design, the control criterion for switching was based
on both the error and its rate of change, which brings us into the
phase domain of analysis rather than Fourier-based methods. It is
permissible to use the auditory stimulus as an error, leaving us with
the question as to what the equivalent control criteria should be to
perform some kind of frequency analysis rather than control, as there
seems to be no getting away from this fundamental requirement.
Determining F0 for a single source, or multiple F0s for multiple
sources, on a continuous basis (the F0s could be changing) becomes
critical for source segregation, leaving us to choose the basis on
which a particular source is selected.
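
For concreteness, here is one conventional way of estimating F0 for a
single source from a short frame, a plain autocorrelation peak picker.
It is only a stand-in to show what the requirement involves, not the
switching-based analysis being proposed; the function name estimate_f0,
the search range and the test signal are all assumptions.

```python
import math


def estimate_f0(frame, sample_rate, f_min=50.0, f_max=400.0):
    """Return the F0 (Hz) whose lag maximizes the raw autocorrelation."""
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation at this lag; the shrinking overlap
        # mildly favours the shortest period over its multiples.
        score = sum(frame[i] * frame[i + lag]
                    for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Hypothetical test signal: a 200 Hz tone with two harmonics at 8 kHz.
sample_rate = 8000
frame = [
    math.sin(2 * math.pi * 200 * n / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 400 * n / sample_rate)
    + 0.25 * math.sin(2 * math.pi * 600 * n / sample_rate)
    for n in range(1024)
]
print(estimate_f0(frame, sample_rate))
```

Calling the function on successive frames gives a running F0 estimate
for a changing source. It does not handle several simultaneous F0s,
which is the harder case raised above.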

Ignoring for the moment changes in general arousal (as during sleep),
it seems that selection of the source depends critically upon the rate
of change, whether of frequency, energy level or the results of
binaural processing. If rate of change is used as the basis of selection
rather than "level" as used in your paper, a clearer criterion becomes
available to model such behavior and it could be applied to every level
of hierarchical structures. It also becomes easier to explain the need
for hierarchical models of the neural system.

All this is of course contingent on selecting the appropriate control
criterion, and I have chosen evaluative bivalence, a concept well
described in the field of psychology, which I have adapted for signal
processing. I have been documenting my progress at my website
www.tonepitch.com and hope to update it soon with some results showing
how speaker-independent vowel identification can take place. At the
moment, spectral centroid provides no relevant information other than
speaker dependency.

I am sorry to have made this email so long and hope that it is relevant
to the present discussion.
Cheers to all,

Randy Randhawa

Diana Deutsch wrote:

Dear Dan, John et al.,

To place this discussion in historical perspective, the 'late
selection' model of attention was first proposed by Deutsch, J.A. and
Deutsch, D., 'Attention: Some theoretical considerations',
Psychological Review, 1963, 70, 80-90. There have, of course, been a
large number of elaborations of this basic model. The article is attached.

Cheers,

Diana Deutsch

This note was intended to be sent to the entire list (my change of
e-mail address prevented it from appearing); Dan has replied to me
privately.

Dan,

What you've suggested so succinctly seems a lot like the concepts on
perception that I discuss on my Web site. Using a basic system
analysis approach I've tried to find fundamental principles from
which to establish generalized requirements for sensory systems based
on what it takes for animals to survive in their environments.

Here is some of my thinking: Survival requires an animal's sensory
organs to produce a timely response to environmental information.
These responses must identify the relative importance of sources in a
way that describes a "situation" whereby "awareness" characterizes
the auditory scene. In any situation, sensors should be able to
select for attention instantaneously the single most important source
within the scene. To do this the model I propose has a hierarchy of
perceptual levels each of which has its own ability for awareness and
attention but is subservient to judgments by higher authority. Each
level's responses are based upon specific time frames within which
information from each level can be made available. (Simple meanings
take less time to absorb than complicated meanings.) In addition,
each level is capable of selecting a priority source within its
domain for attention. For example, the lowest levels, which have the
simplest information, could respond reflexively to a source within
milliseconds, but their response could be moderated by a top-down
decision from higher levels. It is thus possible, as you suggest,
that what seems like simultaneous multiple-source attention at the
conscious level is actually the rapid switching of priorities among
source objects based on subliminal decisions at the lower levels. As
an example, consider the complexity of the attention decisions a
quarterback must process within the few seconds he has during a
football play.

A crucial question: how to achieve and synchronize timely responses
at all levels. It appears to me that this problem has not been
seriously addressed in the current paradigm.

I think another factor to consider in the concurrent segregation of
sounds is that much of the segregation may be accomplished
pre-attentively. Low-level (in the brain, and even cochlea) feature
detectors may segregate aspects of sounds, if not the sounds
themselves, well before they percolate up to what we call
"consciousness." This would depend on the type of segregating cue
under consideration, whether it is pitch, spatial location, onset
time, etc.