\documentstyle[11pt,graphics,doublespace]{article}
%\documentstyle[11pt,graphics]{article}
\begin{document}
{\center {\Large {\bf Information Retrieval and Categorisation using a Cell Assembly Network\\}}}
{\center Christian R. Huyck and\\}
{\center Viviane Orengo\\}
{\center Middlesex University\\}
{\center c.huyck@mdx.ac.uk\\}
\begin {abstract}
Simulated networks of spiking leaky integrators are used for
categorisation and for Information Retrieval (IR). Neurons in the
network are sparsely connected, learn using Hebbian learning rules,
and are simulated in discrete time steps. The system correctly
categorises congresspeople 89\% of the time. The IR systems achieve
40\% average precision on the Time collection, and 28\% on the
Cranfield 1400. All scores are comparable to state-of-the-art results
on these tasks.
\end {abstract}
\section {Introduction}
For the past several years we have been developing neural models to
simulate mammalian neural processing \cite {Huyck1,Huyck2,Ivancich}.
In this paper we apply these neural models to two real world Data
Mining problems.
Hebb proposed Cell Assemblies (CAs) as the basis of human concepts
\cite {Hebb} over 50 years ago. CAs are reverberating circuits of
neurons that can persist long after the initial stimulus has ceased.
There is broad agreement that CAs form the basis of human concepts,
and solve a wide range of other problems \cite
{Braitenberg,Sakurai}. Our earlier experiments have been successful
in simulating a range of behaviours for which CAs are responsible.
Human concepts are complex, so we should be able to use our CA
networks for simpler tasks as well. In doing so we no longer need to
adhere to all of the biological constraints.
Two difficult tasks are categorisation and Information Retrieval (IR).
By using a simple version of our network we can categorise
congressional representatives as either Republicans or Democrats based
on their voting records.
Our model can also be used for IR. By specifying the network topology
using a series of texts, we can learn relationships between words.
This network of relationships can then be used to retrieve documents
based on queries by running our network.
In the next section we describe our neural network architecture and
the learning rules that we use. In the third section, we describe our
categorisation simulation. The fourth section describes our IR
experiments. The fifth section is a discussion of how the two
experiments influence and are influenced by CA issues. The concluding
section points out some current limitations of the system but also
points out how a wide range of problems can currently be solved using
this architecture.
\section {Cell Assembly Networks}
The basis of our computational model is mammalian neural processing.
The neurons are modelled as spiking, fatiguing, leaky integrators.
Neurons are connected via uni-directional synapses, and learning
occurs through synaptic modification. We use unsupervised learning
rules to change synaptic weights, and the changes are based solely on
the properties of the pre- and post-synaptic neurons.
Processing is broken into discrete time steps. On each time step, any
neuron that should fire does fire, and all activation levels are
updated. Neurons learn using Hebbian learning rules. Each neuron is
connected to others by a small number of synapses, arranged in a
biologically plausible manner. Below we explain the neurons, the
learning rules, and the network's topology more thoroughly.
\subsection {Neurons}
A neuron fires if it has enough activation to surpass the activation
threshold; if a neuron fires it loses all of its activation. When a
neuron fires, it sends activation through its synapses to other
neurons. The energy that is sent is equivalent to the weight of the
synapse. This energy is not counted until the next time step. If a
neuron does not fire, some of its activation leaks away.
The activation of a given neuron i at time t is:
$$h_{i,t} = \frac{h_{i,t-1}}{l} + \sum_{j \in V} w_{ji}$$
\centerline{\bf Equation 1.} \\ The current activation is the
activation from the last time step divided by a leak factor {\it l}
plus the new activation coming in. This new activation is the sum of
the inputs of all neurons $ j \in V$, weighted by the value of the
synapse from neuron {\it j} to neuron {\it i}; $V$ being the set of
all neurons that are connected to {\it i} that fired in time step
$t-1$. However, if the neuron fired in the prior time step, no
activation carries over from $t-1$ ({\it l} is infinite).
Neurons, like muscles, also fatigue, so a neuron should not be able
to fire over and over again. This is modelled by raising the
activation threshold each time a neuron fires, and lowering the
threshold, to no less than the base level, when a neuron does not fire.
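
To make the dynamics concrete, the following is a minimal Python
sketch of one simulation step under our reading of the model; the
parameter values and the function name are illustrative assumptions,
not the exact values used in our simulator.

\begin{verbatim}
import numpy as np

# Illustrative sketch of one time step: leaky integration
# (Equation 1), a threshold that rises with fatigue, and loss of
# all activation on firing. Parameter values are examples only.
def step(h, fired, W, theta, leak=2.0, base=4.0,
         fatigue=1.0, recover=0.5):
    # h: activation per neuron (already zeroed for fired neurons)
    # fired: boolean vector of spikes from the previous time step
    # W[j, i]: weight of the synapse from neuron j to neuron i
    h = h / leak + fired.astype(float) @ W   # Equation 1
    new_fired = h > theta                    # fire above threshold
    h = np.where(new_fired, 0.0, h)          # firing drains activation
    theta = np.where(new_fired, theta + fatigue,         # fatigue
                     np.maximum(base, theta - recover))  # recovery
    return h, new_fired, theta
\end{verbatim}
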
\subsection {Hebbian Learning}
\label {sec-HebbLearn}
One of the strengths of this model is that it uses a simple form of
unsupervised learning: Hebbian learning. Hebbian learning increases
the weight of a synapse when both the pre- and post-synaptic neurons
fire. To prevent saturation of the synapse, the weight is decreased
when one of the neurons fires and the other does not. The decreasing
rule is called an anti-Hebbian rule. There is, however, a range of
Hebbian learning rules that fit within this definition.
The rules implied by Hebb are correlatory rules; that is, the synaptic
weight represents the correlation of the two neurons.
Most of our prior work has dealt with pre-not-post forgetting rules:
if the pre-synaptic neuron fires and the post-synaptic neuron does
not, the strength is reduced. Our simple correlatory learning rule
combines a learning rule with the anti-Hebbian rule and makes the
synaptic weight approximate the likelihood that the post-synaptic
neuron fires when the pre-synaptic neuron fires \cite {Huyck1}.
In this paper, we have also used a post-not-pre anti-Hebbian rule
that reduces the synaptic weight when the post-synaptic neuron fires
and the pre-synaptic neuron does not. A post-not-pre correlatory
anti-Hebbian rule makes the synapse represent the likelihood that the
pre-synaptic neuron fires when the post-synaptic neuron fires. We use
this rule in the IR experiments (see section \ref {sec-IR}) for
reasons explained there.
We also make use of a compensatory learning rule as opposed to
a correlatory rule. Correlatory rules reflect the likelihood that
the neurons co-fire. Compensatory rules are a modification of the
correlatory rule that limits the total synaptic strength of a neuron.
This gives neurons with low correlation more influence, and reduces
the influence of highly correlated neurons. These rules are more thoroughly
described in \cite {Huyck2}.
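
As a rough illustration, the following Python sketch shows the three
variants for a single synapse. The learning rates and the compensatory
target are assumptions of ours; the exact rules are given in \cite
{Huyck1,Huyck2}.

\begin{verbatim}
# Illustrative single-synapse updates; rates and the compensatory
# target are assumptions, not the exact published rules.
def pre_not_post(w, pre, post, rate=0.1):
    if pre and post:
        return w + rate * (1.0 - w)   # Hebbian strengthening
    if pre and not post:
        return w - rate * w           # pre-not-post anti-Hebbian
    return w                          # w tends toward P(post | pre)

def post_not_pre(w, pre, post, rate=0.1):
    if pre and post:
        return w + rate * (1.0 - w)
    if post and not pre:
        return w - rate * w           # post-not-pre anti-Hebbian
    return w                          # w tends toward P(pre | post)

def compensatory(w, pre, post, total_in, target, rate=0.1):
    # One way to realise the compensatory idea: scale strengthening
    # by how far the neuron's total incoming weight sits below a
    # constant target, so weakly connected neurons gain influence.
    gain = max(0.0, target - total_in) / target
    if pre and post:
        return w + rate * gain * (1.0 - w)
    if pre and not post:
        return w - rate * w
    return w
\end{verbatim}
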
\subsection {The Network}
In our biologically inspired topologies, we use distance biased connections.
This implies that a neuron is likely to have a connection to nearby neurons,
but unlikely to have connections to any given distant neuron. In the
brain this is useful because it reduces the amount of axonal cable that
is needed.
For both the categorisation and IR tasks, we have entirely ignored this
distance-biased constraint. For the categorisation task we used
random connections, and for the IR task we used co-occurrence in the
documents as the basis of connectivity. We did not use distance-biased
connections because we were not particularly interested in forming
stable states, but in the short-term dynamics.
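
A sketch of the random connectivity used in the categorisation task is
given below; the fan-out of 40 matches the description in section
\ref {sec-cat}, while the function name and data structure are our own.

\begin{verbatim}
import numpy as np

# Illustrative construction of the random excitatory topology:
# each neuron sends synapses to 40 others chosen uniformly at
# random (no distance bias, no inhibitory neurons).
def random_connections(n_neurons, fan_out=40, seed=None):
    rng = np.random.default_rng(seed)
    synapses = {}                      # (pre, post) -> initial weight
    for pre in range(n_neurons):
        others = [n for n in range(n_neurons) if n != pre]
        for post in rng.choice(others, size=fan_out, replace=False):
            synapses[(pre, int(post))] = 0.0  # weights learned later
    return synapses
\end{verbatim}
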
The brain also has inhibitory neurons and our normal model makes
extensive use of these. However, in these experiments we are using
only excitatory neurons. We did not use inhibitory neurons because we
were only interested in behaviour after a few cycles; inhibitory
neurons allow CAs to compete, but this takes dozens of cycles.
Our prior work has dealt with CAs, and the power of these networks
comes from neurons that learn relations based on environmental
stimuli. This work models mammalian brains. However, mammalian brains
have layers of cortex that translate sensory data into internal
features. In these experiments we have skipped these layers of cortex
and directly encoded environmental input into the net. This allows us
to directly exploit the co-occurrence of features.
We are not using CAs in this work. We are using the early stages of
CAs to map co-occurrence of features, and this is a simple but powerful
mechanism.
\section {Categorisation}
\label {sec-cat}
We used a readily available categorisation problem \cite {Blake}, the
1984 U.S. Congressional Voting Records database. This consisted of
435 records, each including a congressperson's party affiliation and
voting record on 16 bills. All 17 fields were binary (yes or no, or
Republican or Democrat), and any of the voting fields could be missing.
We needed a mechanism to present the data to our network. This was
done by making a 17x20 network of neurons. We used only excitatory
neurons, and each neuron was randomly connected via synapses to 40
other neurons. Each of the 17 rows represented one feature: 10
neurons for a yes vote or Republican, and 10 neurons for a no vote or
Democrat. In training mode the appropriate neurons were activated.
So if someone voted on all 16 bills, 170 neurons would be
fired. If someone did not vote on a particular bill, those
neurons were not fired. An example network is shown in figure 1.
\begin{figure}
\resizebox{\textwidth}{4.5 in}{\includegraphics{cvshot3.ps}}
\centerline{Figure 1: Example Network Activation}
\end{figure}
In this example, the first row has the left 10 neurons firing, so it
represents a {\it Republican}. The second row has the right 10 neurons
firing, so this congressperson voted no on the first bill; the third row
is empty, meaning he did not vote on the second bill; and the fourth
row has the left 10 neurons firing, so the congressperson voted yes on
this bill. In all he voted on 15 of the 16 bills, and with his party
affiliation this leads to 160 neurons firing.
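
The encoding can be summarised with the following Python sketch; the
field values ('y', 'n', '?') follow the UCI data, and the helper name
is ours.

\begin{verbatim}
import numpy as np

# Illustrative encoding: a 17 x 20 grid, one row per field; the
# left 10 neurons mean yes/Republican, the right 10 no/Democrat.
ROWS, COLS, HALF = 17, 20, 10

def encode(record):
    # record: 17 values, party first then the 16 votes;
    # values are 'republican'/'democrat', 'y'/'n', or '?' (no vote)
    grid = np.zeros((ROWS, COLS), dtype=bool)
    for row, value in enumerate(record):
        if value in ('y', 'republican'):
            grid[row, :HALF] = True    # left 10 neurons fire
        elif value in ('n', 'democrat'):
            grid[row, HALF:] = True    # right 10 neurons fire
        # '?': the whole row stays silent (missing data)
    return grid.flatten()              # 340 neurons in all
\end{verbatim}
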
In testing mode, the votes were presented by activating the
appropriate neurons, but the first row was omitted. The system
would then run, and activate {\it Democratic} and {\it Republican}
neurons. After 5 cycles of presentation and spread of activation, the
numbers of active neurons were totalled. If more {\it Republican}
neurons were active, the congressperson was categorised as a
Republican. If more {\it Democratic} neurons were active, or an equal
number, the congressperson was categorised as a Democrat.
We trained the system by presenting each of the records to the network
for one cycle. Activation was then erased and the next record
presented. We did this for each of the training records, then repeated
the process 5 times, for a total of 1740 cycles.
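
Putting the pieces together, the training and testing procedures look
roughly as follows, reusing the encode function from the sketch above.
Here \verb|net| is a hypothetical network object with the obvious
operations, standing in for our actual simulator's interface.

\begin{verbatim}
# Hypothetical driver for training and testing; 'net' stands in
# for the simulator (clamp, learn, run are assumed operations).
def train(net, records, epochs=5):
    for _ in range(epochs):           # 5 passes over 348 records
        for rec in records:           # = 1740 one-cycle presentations
            net.clamp(encode(rec))    # fire the encoded neurons
            net.learn_one_cycle()     # correlatory Hebbian update
            net.clear_activation()

def classify(net, record):
    rec = ['?'] + list(record[1:])    # omit the party row
    net.clamp(encode(rec))
    net.run(cycles=5)                 # let activation spread
    rep = net.active_in(row=0, side='left')
    dem = net.active_in(row=0, side='right')
    return 'republican' if rep > dem else 'democrat'  # tie -> Democrat
\end{verbatim}
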
We used the correlatory learning rule, so synaptic strength should have
been roughly equal to the likelihood that the post-synaptic neuron
fired when the pre-synaptic neuron fired.
Though the same Hebbian algorithm is used, this is a type of
semi-supervised learning. The answers are presented, but they are
presented like any other feature, not as a special answer feature.
We performed a standard five-fold analysis on the 435 voting records. This
required five networks to be learned. The results are shown in Table 1.
A given network was trained on 348 records, and tested on 87. The table
shows that the average performance is 89\%.
\begin {table} [ht]
\begin{center}
\begin {tabular}{c | c | c}
& Standard & Reverse \\
\hline
Network 1 & .93 & .88 \\
Network 2 & .90 & .75 \\
Network 3 & .84 & .90 \\
Network 4 & .91 & .88 \\
Network 5 & .87 & .91 \\
Average & .89 & .86 \\
\end {tabular}
\caption {Performance on Categorisation Task}
\end {center}
\end {table}
It was also simple to train the network on 87 records and test
on the remaining 348, a reverse five-fold analysis. The results are
also shown in Table 1. Somewhat surprisingly, the results are almost
as good in this test. Clearly, our network can generalise from a
small set of data, at least for this task.
Prior work on this indicates that the best possible results for this task
are between 90 and 95\% \cite {Schlimmer}. Consequently, this method
performs near the best possible levels.
Our network handles missing data seamlessly. If a congressperson does
not vote on a bill, that row is simply not activated in training or in
testing.
Similarly, the same mechanism could be used to see how a certain
person would vote on a particular bill. Instead of omitting the test
case's party affiliation, the vote on the bill could be omitted, and
the network could predict that congressperson's vote.
Another nice property of this network is that it gives an indication
of confidence in the decision. For instance, the system may have 10
{\it Republican} neurons and no {\it Democratic} neurons for a
particular test, and one {\it Republican} and no {\it Democratic}
neurons for another. Clearly the system is more confident in the first
decision. A simple measure of this confidence is the absolute value of
the difference between the numbers of {\it Republican} and {\it
Democratic} neurons. When the system is highly confident, it is
almost always right. Using the standard five-fold test, 200 cases
have a difference of 9 or more neurons, and only 2 of these are
misclassified. Table 2 shows a more complete breakdown by
confidence. The columns refer to the absolute difference between {\it
Republican} and {\it Democratic} neurons. The running percentage row
gives the percentage correct if the system guessed at this level
and above, while the absolute percentage row gives the percentage
correct if it guessed only at this level.
\begin {table} [ht]
\begin{center}
\begin {tabular}{c | c | c | c | c | c | c | c | c | c | c | c }
Difference & 10 & 9 & 8 & 7 & 6 & 5 & 4 & 3 & 2 & 1 & 0 \\
\hline
Running \% & .99 & .99 & .99 & .98 & .97 & .95 & .95 & .93 & .92 & .89 & .89\\
Absolute \% & .99 & .99 & .97 & .92 & .88 & .71 & .73 & .58 & .54 & .31 & .67\\
\end {tabular}
\caption {Performance Based on Confidence}
\end {center}
\end {table}
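
The two percentage rows in Table 2 can be computed as in the following
Python sketch, assuming a list of (difference, correct) pairs, one per
test case; the code is illustrative only.

\begin{verbatim}
# Running % = accuracy at this difference level and above;
# absolute % = accuracy at exactly this level.
def confidence_breakdown(results):
    # results: list of (abs_difference, correct) pairs
    for level in range(10, -1, -1):
        at = [ok for d, ok in results if d == level]
        at_or_above = [ok for d, ok in results if d >= level]
        run = sum(at_or_above) / len(at_or_above) if at_or_above else 0
        abs_ = sum(at) / len(at) if at else 0
        print(level, round(run, 2), round(abs_, 2))
\end{verbatim}
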
\section {Information Retrieval}
\label {sec-IR}
The goal of IR is to match user queries to documents stored in a
database. A typical IR system comprises stop-word removal, stemming,
weighting, indexing, and ranking algorithms. This section explains
how we have used CA networks as an IR system. We describe the
method, report on the experiments carried out, and present the
results achieved.
\subsection {Basic Structure of Retrieval}
Documents are composed of terms. In our model, each term is
represented by a neuron. The network is presented with documents by
activating the neurons that represent the words in the documents. As
the documents are presented, the weights of the synapses between
neurons are adjusted. This leads to weights that reflect co-occurrence
of words in documents.
Once the learning phase is concluded, the network is presented
with queries. The queries consist of a set of terms that describe an
information need. The neurons corresponding to those terms are
stimulated and send activation to other neurons causing them to fire.
This leads to a cascade of neural firing increasing with each time step.
Documents that have similar neural activation patterns are retrieved.
One of the most widely applied techniques to improve recall in an IR
system is query expansion. The basic idea is to accommodate term usage
variations by expanding the terms in the query using related
terms. Normally, this is done using a thesaurus that adds synonyms,
broader terms, and related words. However, manually constructed
thesauri are not always available, so research into new methods
for automatic query expansion has been quite active. Some proposed
approaches include: term clustering \cite {Lesk}, automatic thesaurus
construction \cite {Crouch}, and Latent Semantic Indexing \cite
{Deerwester}.
Our CA network provides embedded query expansion, since when we
stimulate query terms, they send activation to related terms,
automatically expanding the query. As the query expands it becomes
similar to documents. This, in effect, is search only by query
expansion, followed by a direct similarity measurement between the
query and documents.
The similarity between queries and documents is calculated using
Pearson's correlation \cite {Hetherington}, which varies between -1
and 1. A correlation of 1 denotes that the patterns are identical, and
a correlation of -1 indicates that the patterns do not share any
features. We compare the state of the network at some point after the
query has been presented against all documents, and then rank the
documents in decreasing order of correlation.
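
A sketch of the ranking step follows; the vector representation (one
entry per term neuron) matches the description above, while the
function itself is our own construction.

\begin{verbatim}
import numpy as np

# Rank documents by Pearson correlation between the network state
# after a query and each document's activation pattern.
def rank_documents(query_state, doc_patterns):
    # query_state: vector over term neurons after 5 cycles
    # doc_patterns: one row per document, same neuron order
    q = query_state - query_state.mean()
    D = doc_patterns - doc_patterns.mean(axis=1, keepdims=True)
    denom = np.linalg.norm(D, axis=1) * np.linalg.norm(q) + 1e-12
    corr = (D @ q) / denom          # Pearson correlation per document
    return np.argsort(-corr)        # best-first document indices
\end{verbatim}
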
The standard IR evaluation involves two measurements: recall and
precision. Recall measures the proportion of relevant documents that
have been retrieved and precision measures the proportion of retrieved
documents that are relevant. In order to calculate those figures, we
need a test collection containing documents, queries, and relevance
assessments that tell for each query which documents should be
retrieved. The typical evaluation calculates precision at 11 recall
points (0\%, 10\%, ... 100\%).
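
For reference, the 11-point figure can be computed as in the sketch
below, assuming a best-first ranking and a non-empty set of relevant
documents for the query.

\begin{verbatim}
# Standard 11-point interpolated average precision for one query.
def eleven_point_precision(ranking, relevant):
    pr = []                           # (precision, recall) per rank
    hits = 0
    for i, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
        pr.append((hits / i, hits / len(relevant)))
    points = []
    for level in (r / 10 for r in range(11)):  # 0.0, 0.1, ..., 1.0
        # interpolated precision: max precision at recall >= level
        ps = [p for p, r in pr if r >= level]
        points.append(max(ps) if ps else 0.0)
    return sum(points) / 11
\end{verbatim}
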
\subsection {Problem of Under and Over Similarity}
One problem for our system is that any given word pair is not very
likely to occur in any document. Consequently, our standard method of
randomly connecting neurons would lead to the vast majority of
synapses having no weight after training. During a query, these small
weights would lead to no neurons being activated aside from the query
neurons. To solve this under similarity problem, we fixed the
connectivity of the network. Neurons were connected only to neurons
when the words they represented co-occurred in some documents.
However, our initial attempts led to an over-similarity problem. That
is, each query presented led to the same neurons being activated.
This was due to frequent words having high connection weights, which
caused them to be activated on every query. This relates to the Term
Frequency Inverse Document Frequency (TF-IDF) problem explained more
thoroughly in section \ref {sec-discCA}. In order to solve this
problem, we used three mechanisms that reward frequent words less.
Firstly, we tested a different topology; secondly, we tried a
post-not-pre learning rule; and thirdly, we used a compensatory
pre-not-post learning rule.
When we first solved the under-similarity problem, we connected each
neuron to the 40 neurons representing the words that co-occurred most
frequently with the word the neuron represented. This gave the
neurons representing the most frequent words more incoming
connections. Consequently, when we tested, those most frequent words
always became active. A different solution was to choose randomly
among the words that co-occurred, thus rewarding frequent words less.
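
The two connectivity schemes can be sketched as follows, assuming
\verb|cooccur[w]| holds co-occurrence counts for the word w; the names
are ours.

\begin{verbatim}
import random
from collections import Counter

# Two ways to pick a word's 40 connections from its co-occurrence
# counts (cooccur[w] is a Counter over co-occurring words).
def sorted_topology(cooccur, w, fan_out=40):
    # the 40 words w co-occurs with most often (favours common words)
    return [v for v, _ in cooccur[w].most_common(fan_out)]

def random_topology(cooccur, w, fan_out=40):
    # any 40 co-occurring words, chosen at random
    partners = list(cooccur[w])
    return random.sample(partners, min(fan_out, len(partners)))
\end{verbatim}
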
A second mechanism we used was to change our learning rule. We used a
post-not-pre anti-Hebbian rule instead of our typical pre-not-post
rule (see section \ref {sec-HebbLearn}). This tended to reduce the
weight of synapses going to frequent words.
The third mechanism was to use compensatory learning. Here the
learning rule uses the total weight coming into a neuron and tries to
bring that total toward a constant. So, neurons whose incoming
synapses have low correlations have those weights increased, and
neurons whose incoming synapses have high correlations have them
decreased.
All three of these mechanisms have the effect of making frequent words
less important, and infrequent words more important. This is
desirable because words that occur few times within a collection are
more distinctive and therefore are more important.
\subsection {Results}
We carried out experiments using two standard IR collections: the Time
Magazine collection, containing news articles from 1963; and Cranfield 1400,
containing abstracts in aeronautics. We used the Porter stemmer
\cite {Porter} to remove word suffixes, and we also removed stop words
according to the list provided by SMART \cite {Smart}. Table 3 shows some
characteristics of the collections used:
\begin {table} [ht]
\begin{center}
\begin {tabular}{|c | c | c |}
\hline
&Time Magazine&Cranfield 1400\\
\hline
Number of documents & 425 & 1400\\
Number of queries & 83 & 225\\
Number of terms & 7596 & 2629 \\
\hline
\end {tabular}
\caption {Characteristics of Test Collections}
\end {center}
\end {table}
Each term occurring in more than one document was assigned a
neuron. Each neuron has connections to 40 other neurons. We tried
two types of networks: (i) connecting each neuron with the 40 neurons
it co-occurs with most frequently, and (ii) connecting each neuron
with 40 randomly chosen neurons it co-occurs with. We also tested
correlatory versus compensatory learning rules.
During the learning phase, each document was presented to the network
for 1 cycle. This was repeated 20 times. At the end of this process
the synaptic weights had been set. The queries were then presented;
we allowed the activation to spread for 5 cycles, and then saved the
state of the network. The state corresponding to each query was
compared to the activation pattern of each document using Pearson's
correlation.
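
In outline, the two phases look like this; as before, \verb|net| is a
hypothetical interface standing in for our simulator.

\begin{verbatim}
# Hypothetical driver for the IR learning and query phases.
def build_network(net, documents, epochs=20):
    for _ in range(epochs):           # 20 passes over the collection
        for terms in documents:       # one cycle per document
            net.clamp(terms)          # fire the document's term neurons
            net.learn_one_cycle()
            net.clear_activation()

def query_state(net, query_terms, cycles=5):
    net.clear_activation()
    net.clamp(query_terms)            # stimulate the query terms
    net.run(cycles)                   # spreading activation acts as
    return net.activation_snapshot()  # embedded query expansion
\end{verbatim}
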
Table 4 shows a summary of our results in terms of average
precision. Compensatory learning was superior to correlatory
learning on both the sorted and random topologies. This happened
because compensatory learning provides a mechanism for dealing with
common words, which has a similar effect to TF-IDF.
We also observed that the random network was better than the sorted
topology. This is evident in the performance of both topologies using
both compensatory and correlatory learning. In the sorted topology,
common words again benefited, being activated by most queries.
\begin {table} [ht]
\begin{center}
\begin {tabular}{|c | c | c |}
\hline
Run & Time & Cranfield\\
\hline
Correlatory - Random & 0.3361 & 0.1715\\
Compensatory - Random & 0.4021 & 0.2812 \\
Correlatory - Sorted & 0.1930 & 0.0138\\
Compensatory - Sorted & 0.3583 & 0.1267 \\
LSI (TF-IDF) & 0.3038 & 0.1708 \\
\hline
\end {tabular}
\caption {Performance in Average Precision}
\end {center}
\end {table}
Compensatory learning provided more improvement than the random
topology, and thus was the more important factor. However, the
mechanisms can be combined, and the best system combined compensatory
learning with the random topology.
We have also compared post-not-pre and pre-not-post learning, the
latter being more effective. For the Time collection our best result
with post-not-pre used a random topology with compensatory learning and
had a 20.97\% average precision (not shown in the tables or figures). The
similar pre-not-post run achieved 40.21\%. In pre-not-post learning, we
control the amount of activation going into the neurons as opposed to
restraining the amount of activation they emit. This reduces input to
common words.
\begin{figure}
\begin {center}
\resizebox{\textwidth}{3 in}{\includegraphics{rptime2.eps}}
\centerline{Figure 2: Recall-Precision Curves for Time Collection}
\end {center}
\end{figure}
\begin{figure}
\begin {center}
\resizebox{\textwidth}{3 in}{\includegraphics{rpcran2.eps}}
\centerline{Figure 3: Recall-Precision Curves for Cranfield Collection}
\end {center}
\end{figure}
Figures 2 and 3 show recall-precision curves for our trials. We
compared them against a well-known technique, Latent Semantic
Indexing (LSI) \cite {Deerwester}. LSI is also based on term
co-occurrences; the process involves applying a factor analysis to a
term-by-document matrix. Our results were superior to those of LSI
using the TF-IDF weighting scheme.
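
For readers unfamiliar with LSI, the following sketch shows the basic
computation we compared against: a rank-k truncated singular value
decomposition of a TF-IDF term-by-document matrix, with queries folded
into the reduced space. The value of k and the helper names are
illustrative; see \cite {Deerwester} for the method itself.

\begin{verbatim}
import numpy as np

# Minimal LSI sketch: factor a TF-IDF term-by-document matrix and
# rank documents by cosine similarity in the reduced space.
def lsi_rank(A, q, k=50):
    # A: terms x documents (TF-IDF weighted); q: query term vector
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Uk, sk, Dk = U[:, :k], s[:k], Vt[:k, :].T   # rank-k truncation
    q_hat = (q @ Uk) / sk                       # fold the query in
    sims = (Dk @ q_hat) / (np.linalg.norm(Dk, axis=1)
                           * np.linalg.norm(q_hat) + 1e-12)
    return np.argsort(-sims)                    # best-first documents
\end{verbatim}
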
\section {Discussion}
The previous two sections have shown that CA networks can be used for
real world tasks, but what implications does this work have? Firstly,
how does it reflect on CAs and mammalian neural processing? Secondly,
how might CA networks be extended to become better categorisers and
better IR systems?
\subsection {Implications from CA networks}
\label {sec-discCA}
On the positive side, this work shows that the basic underlying neural
and learning mechanisms of our CA networks are very powerful. In
these experiments, these mechanisms allow us to learn tens of
thousands of complex relationships in a very short time.
The learning mechanism is flexible, supporting both unsupervised
learning, as in the IR case, and semi-supervised learning, as in the
categorisation case.
We have also used a compensatory learning rule. Information
theoretically, this is almost the same as TF-IDF. Weak neurons map
directly to words that occur infrequently, while strong neurons map to
those that occur frequently. The compensatory rule strengthens weak
neurons and weakens strong ones; TF-IDF strengthens infrequent words
and weakens frequent words. We do not really understand how the
Hebbian learning rule is implemented in mammalian neurology, and our
IR experiment provides computational support for the use of this rule.
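To make the analogy concrete, one common form of the TF-IDF weight for
a term $t$ in a document $d$ is
$$w_{t,d} = tf_{t,d} \times \log \frac{N}{df_t}$$
where $tf_{t,d}$ is the frequency of $t$ in $d$, $N$ is the number of
documents in the collection, and $df_t$ is the number of documents
containing $t$. Rare terms (small $df_t$) receive large weights, just
as the compensatory rule strengthens synapses onto weakly correlated
neurons; the exact TF-IDF variant differs between systems.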
On the negative side, we have not really made use of CAs in this
work. CAs are reverberating circuits of neurons. If we used CAs in,
for example, testing in the IR experiment, we would have presented the
query and allowed the system to run on for dozens or even hundreds of
cycles. If we did this with the current networks, thousands or even
all of the neurons would become active, and all of the queries would
have the same result. In a more sophisticated system with, for
instance, inhibitory neurons, there would ideally have been hundreds
of CAs that could have been retrieved. Since we did not do this, it is
clear that we do not have a thorough understanding of how to form
these CA stable states. The attractor dynamics that form complex CAs
are something we need to learn more about.
Obviously one type of future work is to use real CAs. This might also
enable us to automatically select the documents in the IR task as we
do in the categorisation task without the need for Pearson's
measurements.
\subsection {CA networks for categorisation}
In section \ref {sec-cat}, we pointed out some of
the strengths of our categorisation method. It handles missing data
easily, it can easily be changed to guess any feature, and it gives a
measure of confidence in its decisions. Additionally, it could easily
work for a large number of categories, although this example used only
two.
The sceptic might point out that our categorisation task was an easy
problem, or at least particularly appropriate to our system. Indeed it
was. It used categories that were based on probabilistic features, and
the categories were widely separated by those features.
This system has difficulties with traditional problems like
overlapping categories. For instance, if there were two types of
Democrats and one of the types was very similar to the Republicans,
the system's performance would decrease. Perhaps the use of full CAs
would solve this, but this is merely speculation. Nonetheless, there
is scope for further exploration of CA networks for categorisation.
\subsection {CA networks for Information Retrieval}
We are quite pleased with the positive results of the IR experiments.
We have made minor adjustments to the basic neural model, and the
results are comparable to existing IR work. Training is rapid; we
have trained on the 1400-document Cranfield corpus using 3600 neurons
in under 15 minutes on a low-end PC. Somewhat worryingly, retrieval is
rather slow because we must do all of the Pearson's comparisons;
however, it still takes under a minute.
Clearly this work is only exploratory. There are many more questions
to explore.
We would like to consider how the number of neurons activated depends
on the number of terms in the query. While some queries with
a small number of terms activate more neurons than ones with
a larger number, generally, more terms will lead to more neurons
being active. Perhaps using full CAs, or even just inhibitory
neurons might change this.
We would like to consider how to scale to deal with larger
collections. The above initial experiments have been carried out using
small test collections. We intend to perform experiments with larger
sets of data. In order to do that we need to optimise our matching
algorithms to reduce processing time.
We would also like to consider how to implement different weighting
schemes. IR research has shown that the weighting scheme has a great
impact on a system's performance \cite {Dumais}. At the moment our
system simulates a kind of TF-IDF weighting; we aim to experiment
with other techniques.
Finally, we would like to consider different similarity measures. For
the experiments described in this paper, we used Pearson's correlation
to assess the similarity between queries and documents. We are not
certain that this is the best measure and intend to try other
alternatives such as the widely used cosine correlation.
\section {Conclusion}
Our CA network was designed to model the computationally important
aspects of mammalian neural processing. We know that human neural
processing is complex but it is also more effective than any other
existing computational system. CA networks are based on simple
neurons that are connected by synapses that learn via Hebbian
learning.
This parallel learning and knowledge representation mechanism is
extremely powerful. It allows us to easily implement systems for two
disparate tasks. Our experiments have shown that a CA network
performs near the state of the art on our categorisation and IR
experiments. The ease with which we developed these systems from the
basic CA network model implies that the model should be usable for a
wide range of tasks.
This paper is a twofold example of how neuro-biology and
computational theory can aid each other. Our neurally inspired model
helps us perform real tasks, improving our understanding of
computational theory. Neuro-biology cannot entirely explain the
nature of the Hebbian learning rule; the computationally motivated
TF-IDF measurement lends support for the compensatory learning rule as
a Hebbian learning rule that the brain should use.
The CA network model is powerful, but we need a better understanding
of it. We can use real world tasks like categorisation and IR to
explore this model, we can continue our work with neural and cognitive
simulations, and we can look at neurobiology and psychology. All of
these things can improve the model. As we have shown in this paper,
the model can also be used to solve real world tasks.
{\bf Acknowledgements:} \\This work was supported by EPSRC
grant GR/R13975/01.
\begin {thebibliography}{99}
\bibitem {Blake} Blake, C., and C. Merz. (1998). UCI Repository of
machine learning databases [http://www.ics.uci.edu/$\sim$mlearn/MLRepository.html].
Irvine, CA. University of California, Department of Information and
Computer Science.
\bibitem {Braitenberg} Braitenberg, V. (1989)
Some Arguments for a Theory of Cell Assemblies in the Cerebral Cortex.
In {\em Neural Connections, Mental Computation} Nadel, Cooper, Culicover and Harnish
eds. MIT Press.
\bibitem {Crouch} Crouch, C. (1988) A Cluster-Based Approach to
Thesaurus Construction. In {\em ACM SIGIR}, Grenoble, France.
\bibitem {Deerwester} Deerwester, S., S. Dumais, G. Furnas,
T. Landauer and R. Harshman. (1990) Indexing by Latent Semantic
Analysis. In {\em Journal of the American Society for Information Science},
41(6):1-13.
\bibitem {Dumais} Dumais, S. (1991) Improving the Retrieval of Information
from External Sources. In {\em Behaviour Research Methods, Instruments \&
Computers} 23(2) pp. 229-36.
\bibitem {Hebb} Hebb, D.O. (1949) The Organization of Behavior. John Wiley and Sons,
New York.
\bibitem {Hetherington} Hetherington, P., and M. Shapiro. (1993)
Simulating Hebb cell assemblies: the necessity for partitioned dendritic trees
and a post-not-pre LTD rule.
{\em Network: Computation in Neural Systems} 4:135-153.
\bibitem {Huyck1} Huyck, C. (2002) Overlapping Cell Assemblies
from Correlators. In {\em Neurocomputing Letters}
\bibitem {Huyck2} Huyck, C. (2002) Cell Assemblies and Neural Network
Theory: From Correlators to Cell Assemblies.
Middlesex University Technical Report ISSN 1462-0871 CS-02-02
\bibitem {Ivancich} Ivancich, J. E., C. Huyck and S. Kaplan. (1999)
Cell Assemblies as Building Blocks of Larger Cognitive Structures. In
{\em Behavioral and Brain Sciences}. Cambridge University Press.
\bibitem {Lesk} Lesk, M. E. (1969) Word-word Associations in Document Retrieval.
In {\em American Documentation} 20(2) 119-48.
\bibitem {Porter} Porter, M. (1980) An Algorithm for Suffix Stripping.
In {\em Program} 14(3) pp. 130-7.
\bibitem {Sakurai} Sakurai, Y. (1998) The search for cell assemblies
in the working brain. In {\em Behavioural Brain Research} 91 pp. 1-13.
\bibitem {Schlimmer} Schlimmer, J. (1987) Concept acquisition through
representational adjustment. Doctoral dissertation, Department of Information
and Computer Science, University of California, Irvine, CA. As reported by
\cite {Blake}.
\bibitem {Smart} SMART English Stoplist (2003)
[http://www.cs.utk.edu/$\sim$lsi/corpa.html]
\end {thebibliography}
\end {document}