Jocelyn Paine

Dr. Dobb's Bloggers

What Might Category Theory do for Artificial Intelligence and Cognitive Science?

April 25, 2010

In August 2008, I posted a series of answers to the question
"Why should we be interested in category theory?" on the
"A Categorical Manifesto" thread in the n-Category Café blog.
Category theory is a mathematical tool often used to elucidate
similarities between apparently unrelated pieces of mathematics.
I suggested it could do the same for AI and cognitive science,
and discussed examples that include neural nets, holographic
reduced representations, Prolog-style unification, analogical
reasoning, and understanding metaphors. Here is the same posting,
with an informal explanation of category theory added, and the
rest made intelligible (I hope) to non-category-theorists.

A
"category"
is a mathematical structure, in the same
sense that a vector space, a group, or a set is.
Vector spaces contain vectors; groups contain
elements that represent symmetry operations;
sets contain points.
And a category contains
"objects" related by
"arrows". Visualise it as
dots connected by arrows.

These objects and arrows must follow
rules. Vector spaces must also: they have to
have a null vector and obey linearity
constraints. Groups must contain
an inverse for each element, and an
identity that leaves elements unchanged when
you multiply by it. And all categories
must follow the rule that if there is
an arrow from object A
to object B, and an going from
object B to object C, there must also be
a composite of these from A to C. The categories
must follow other rules too, but that is
an important one. Visualise the fact that
if you can put a pig through a mincer
to make mince, and mince through a sausage machine to
make sausages, you can put a pig through
(a mincer connected to a sausage machine) to
make sausages.
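
For programmers, here is a minimal sketch of that composition
rule in Haskell, whose types and functions themselves form a
category: types are the objects, functions the arrows, and (.)
the composition. The Pig, Mince and Sausage types are my own
inventions for this illustration.

-- A minimal sketch: Haskell types and functions form a category.
data Pig     = Pig
data Mince   = Mince
data Sausage = Sausage

mincer :: Pig -> Mince              -- an arrow from Pig to Mince
mincer _ = Mince

sausageMachine :: Mince -> Sausage  -- an arrow from Mince to Sausage
sausageMachine _ = Sausage

-- The composite arrow from Pig to Sausage that the rules demand:
pigToSausage :: Pig -> Sausage
pigToSausage = sausageMachine . mincer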

One can study the consequences of these
rules for their own sake. But many users
of this mathematical "app" treat categories
as mathematical workspaces. The objects will
stand for mathematical entities such as
vector spaces, groups, or sets. Or for
computational entities such as programs, neural
networks, concept hierarchies, or
ontologies. They can even stand for
categories — which makes category theory
horribly and fascinatingly recursive.

An arrow in such a "workspace" category
will usually
stand for some kind of transformation or
function or process or relation. But almost always, if not always,
an arrow doesn't represent any old
transformation or function or process or relation.
It represents a
transformation or function or process or
relation that "preserves structure".

Just how the arrows preserve structure
depends on the category.
Mathematicians will know what I mean
if I say that in the category whose
objects are groups, all arrows
(which, by definition, transform one group to another)
map the identity element in the
source group to the identity element
in the target group. In a category whose
objects were jokes, one would probably want all arrows
(which, by definition, would transform one joke to another)
to map the punchline in the
source joke to the punchline
in the target joke. In a category
whose objects were trains, one might make all arrows
map the engine of the source train to the
engine of the target train.
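
For programmers, here is a minimal Haskell sketch of the
group-like kind of structure preservation. To keep it short I use
monoids, which are groups without inverses: length is an arrow
from the monoid of strings (operation ++, identity "") to the
monoid of integers (operation +, identity 0), and it maps
identity to identity.

-- A minimal sketch: length is a structure-preserving arrow.
preservesIdentity :: Bool
preservesIdentity = length "" == 0             -- identity goes to identity

preservesOperation :: String -> String -> Bool
preservesOperation xs ys =
  length (xs ++ ys) == length xs + length ys   -- ++ goes to +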

The objects and arrows in a category draw a picture
of the structure-preserving transformations that exist
within a coherent collection
of mathematical entities. Things get more
interesting when we have two categories.
For example, we might have a category of
concept hierarchies and a category of neural nets,
and map the hierarchies onto
the nets such that the transformations
between nets mirror the transformations
between hierarchies. Visualise
a Perspex sheet bearing the network
of transformations between hierarchies, and another
sheet bearing the network
of transformations between nets,
and us holding one above the other and drawing a line
from each hierarchy to its counterpart net.

The mathematical tool that handles this is a
"functor". Visualise
it as an
arrow that bundles together all the lines you
have drawn between the sheets.

A functor can be regarded as a structure-preserving
transformation between categories:
that is, as an arrow in the category of categories,
which maps the category of
hierarchies onto the category of
nets. Visualise the category of
categories as dots and arrows. Each dot is a Perspex
sheet like the ones mentioned above.
You have drawn lines between the sheets, as above. Each arrow
is a bundle of all the lines belonging to
one transformation between a pair of sheets.

This works
because it maps the parts of the category of
hierarchies (whose parts are the hierarchies)
onto the parts of the category of
nets (whose parts are the nets)
in the same consistent way that I
described for mapping group identities
onto group identities, engines onto engines,
or punchlines onto punchlines.
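
Haskell programmers meet a small example of this every day, so
here is a minimal sketch: Maybe is a functor from the category of
Haskell types and functions to itself, and fmap is the bundle of
lines drawn between the two sheets.

import Data.Char (toUpper)

shout :: String -> String
shout = map toUpper

-- Maybe maps each object a to Maybe a; fmap maps each arrow f to fmap f:
image :: Maybe String
image = fmap shout (Just "sausage")        -- Just "SAUSAGE"

-- Composition is preserved (checked here on one value, not proved):
compositionPreserved :: Bool
compositionPreserved =
  fmap (shout . reverse) (Just "pig")
    == (fmap shout . fmap reverse) (Just "pig")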

I said that category theory is horribly and
fascinatingly recursive. I have introduced transformations
between categories: namely functors.
But one can also have transformations
between functors: that is, transformations
between transformations
between categories. These are called "natural
transformations".

Consider our hierarchies and nets
again, but now assume we have type-A nets
and type-B nets. The category of hierarchies is
as before; the category of nets contains the type-A
nets and the type-B nets. Visualise the hierarchies
as green dots in the hierarchies category, and the type-A's
as red dots and the type-B's as blue dots in the nets
category. Then there is a bundle of arrows mapping the green
dots to the red dots. That's one functor, which I'll
call A. There is another bundle of arrows mapping the green
dots to the blue dots. That's another functor, which I'll
call B. If there is a natural transformation between A and B,
it means that they are compatible in a special and
particularly useful way. I'll return to this
in my section on neural nets.
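
For programmers, the smallest natural transformation I can think
of makes a useful sketch: safeHead transforms the list functor
into the Maybe functor, giving one arrow per object, all of them
compatible with fmap.

safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

-- Naturality: transforming then mapping equals mapping then
-- transforming (checked here on one example, not proved):
naturalityHolds :: Bool
naturalityHolds =
  fmap (* 2) (safeHead [3, 4, 5 :: Int])
    == safeHead (fmap (* 2) [3, 4, 5])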

A very general reason that categories
will be useful is
that
they'll help us understand the "Interesting
Truths" of computer science: the "subtle unifying
insights between broad classes of mathematical structures",
to quote the paragraph from SF author
Greg Egan's
novel Incandescence that
I once posted
to the n-Category Café:

‘Interesting Truths' referred to a kind of
theorem which captured subtle unifying insights between broad
classes of mathematical structures. In between strict
isomorphism — where the same structure recurred
exactly in different guises — and the loosest
of poetic analogies, Interesting Truths gathered together
a panoply of apparently disparate systems by showing them
all to be reflections of each other, albeit in a suitably
warped mirror. Knowing, for example, that multiplying
two positive integers was really the same as adding
their logarithms revealed an exact correspondence
between two algebraic systems that was useful, but
not very deep. Seeing how a more sophisticated
version of the same could be established for a
vast array of more complex systems — from
rotations in space to the symmetries of subatomic
particles — unified great tracts of physics and
mathematics, without collapsing them all into mere copies of a single
example.

I hope you can see from my
description of functors
as drawing a picture of one category in another,
and of natural transformations as transforming
one such drawing into another, why Egan
wrote that passage. It applies to all of mathematics.
But more specifically, I'm
interested in categories because I think they have huge
— and under-used — potential in
understanding human cognition and in building
"intelligent" computers.

My first example concerns understanding how "logical
unification" and "holographic reduced
representations" are related.
Logical
unification is a kind of pattern matching.
(It's often just called "unification",
but I'll preface with "logical" to avoid confusion with
"unify"
as in "insight". Logicians reading this, please forgive such a
cursory description.)

For instance, in Prolog, unifying the term
2*x+3*y
with the term U+V will bind the variable U to 2*x and the variable
V to 3*y. Prolog programmers love this, because it's such an
easy way to get at pieces of structures.
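
To show the flavour in code, here is a minimal sketch of such
matching in Haskell, over a tiny term type of my own invention.
Real Prolog is far more efficient, and (as it happens) its
standard unification also omits the occurs check, as my sketch
does.

import qualified Data.Map as M
import Control.Monad (foldM)

data Term  = Var String | Fun String [Term] deriving (Eq, Show)
type Subst = M.Map String Term

-- Unify two terms, extending a substitution from variables to terms:
unify :: Term -> Term -> Subst -> Maybe Subst
unify (Var v) t s = bindVar v t s
unify t (Var v) s = bindVar v t s
unify (Fun f as) (Fun g bs) s
  | f == g && length as == length bs =
      foldM (\s' (a, b) -> unify a b s') s (zip as bs)
  | otherwise = Nothing

bindVar :: String -> Term -> Subst -> Maybe Subst
bindVar v t s
  | t == Var v = Just s
  | otherwise  = case M.lookup v s of
      Just t' -> unify t' t s
      Nothing -> Just (M.insert v t s)       -- occurs check omitted

-- Unifying 2*x+3*y with U+V binds U to 2*x and V to 3*y:
example :: Maybe Subst
example = unify (Fun "+" [Fun "*" [Fun "2" [], Fun "x" []],
                          Fun "*" [Fun "3" [], Fun "y" []]])
                (Fun "+" [Var "U", Var "V"])
                M.empty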

But — computer science also
knows a different kind of pattern matching,
implemented by so-called "holographic reduced
representations" or HRRs. These encode structures as
bitwise or numeric vectors in very high-dimensional spaces,
using circular convolution and circular correlation (or analogous
operations) to build and decompose vectors. I'll
show an example in the next paragraph; for those who want
to know more, these references give some background:

Here's an example from Kanerva's paper.
The following equation computes a bitwise
vector encoding data about France:

France = [Capital*Paris + Location*WesternEurope + Currency*Euro]

Here, + is vector addition, * is exclusive-or. The brackets
[] denote a "normalised mean", whose exact
definition is given in Kanerva's paper. Capital, Location and
Currency are three randomly-generated vectors. The vector
Capital represents a slot or field whose value
will represent the capital of France; similarly
for the vectors Location and Currency.

Paris, WesternEurope and Euro are three more
randomly-generated vectors, representing
the values of these slots. It's important
to note that all six vectors should be interpreted as
atomic symbols: the actual values of their components have no
significance.

Having computed the vector France, we
can "probe" it by exclusive-oring it
with the vector Capital:

France * Capital ≈ Paris + noise

This works because * (that is, exclusive-or) is its
own inverse, and because it distributes over the normalised mean [].

The final stage is to do what Kanerva calls a
"clean-up". You assume there is an
associative memory that contains the six vectors originally
generated. Fed any vector, it will quickly return
whichever of the six is closest to it, which in this case,
we hope, will be Paris.

For this to work, various conditions on the
distribution of vectors must be satisfied. But if they
are, you can do some really nifty analogical reasoning.
For example, you can ask how France is
related to Paris by exclusive-oring the vector
France with the vector Paris. This works the same way
as above, and after the clean-up, will return Capital.
Kanerva goes on to describe more ambitious reasoning
tasks, such as forming a vector that represents
Sweden, then asking "What is the Paris of Sweden?"
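
For the curious, here is a minimal Haskell sketch of the whole
cycle. The dimension and seeds are my own arbitrary choices, and
I am assuming exclusive-or for *, a bitwise majority vote for the
normalised mean [], and nearest-Hamming-distance clean-up;
consult Kanerva's paper for the real definitions.

import System.Random (mkStdGen, randoms)
import Data.List (minimumBy, transpose)
import Data.Ord (comparing)

type Vec = [Bool]

randVec :: Int -> Vec                        -- a random atomic symbol
randVec seed = take 10000 (randoms (mkStdGen seed))

xor :: Vec -> Vec -> Vec                     -- the * of the text
xor = zipWith (/=)

mean :: [Vec] -> Vec                         -- the [..] of the text
mean vs = [2 * length (filter id bs) > length bs | bs <- transpose vs]

cleanup :: [(String, Vec)] -> Vec -> String  -- nearest stored vector
cleanup memory v = fst (minimumBy (comparing (dist . snd)) memory)
  where dist u = length (filter id (xor u v))

main :: IO ()
main = do
  let names = ["Capital", "Location", "Currency",
               "Paris", "WesternEurope", "Euro"]
      vecs@[capital, location, currency, paris, western, euro] =
        map randVec [1 .. 6]
      france = mean [xor capital paris, xor location western,
                     xor currency euro]
  putStrLn (cleanup (zip names vecs) (xor france capital))  -- "Paris"
  putStrLn (cleanup (zip names vecs) (xor france paris))    -- "Capital"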

On the surface, holographic reduced
representations are utterly different from
logical unification. But I can't help
feeling that, at a deeper level, they are
closely related. And there is a categorical formulation of logical
unification, described in the
first reference below, by Rydeheard and Burstall. They say their
formulation is derived from an observation by Goguen. So it may be based
(I'm not an expert) on the ideas in the second reference:

So, can we extend that categorical formulation
to holographic reduced representations? I don't know.
But if we could, we would better understand
how they are related to logic programming, and
we might gain new tools for
analogical reasoning. It's worth trying.

Here's my second example: neural networks, and
logic-based representations. I suppose most readers have come across
neural networks, but for those
who haven't, I'll describe
them briefly. Neural
networks consist of "nodes"
connected in a network. The connections between nodes correspond to
synapses
between biological neurons.
Typically, a node sums the signals from its input nodes,
multiplying each input signal by the strength of the corresponding
connection. It may then process the sum in some way — e.g. by
thresholding — to
calculate an output and an "activation". The entire network
can be
regarded as a vector of these activation values, representing the data in
the network.
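
A single node of this typical kind is tiny in code. Here is a
minimal sketch, with a hard threshold standing in for whatever
processing a real design might use:

-- A node: weighted sum of inputs, then a threshold.
node :: [Double] -> [Double] -> Double
node weights inputs
  | sum (zipWith (*) weights inputs) > 0 = 1  -- fires
  | otherwise                            = 0  -- stays quiet

-- For example, node [0.5, -0.3, 0.8] [1, 1, 1] is 1.0: the sum is 1.0.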

Most kinds of neural network can learn by adjusting the
strengths of their connections, thus mimicking —
in a very simplified way — how the brain learns by
adjusting synaptic strengths.

In Artificial Intelligence, there is a vast cultural
chasm between neural nets and symbolic ways to represent
knowledge. It's rare for the same researcher to work
on both, and students usually get taught about them in
different courses. This reflects the differences between
the two. The symbols in symbolic representations denote
things. But in some kinds of neural net, the individual
nodes don't, although a whole set of nodes might.
Therefore, such nets are often called "subsymbolic",
a word introduced by Paul
Smolensky in the following paper:

As well as the "symbolic/subsymbolic divide",
there is a "compositional/non-compositional divide". Symbolic
ways to
represent knowledge
are "compositional", in that
the meaning of a sequence of symbols is
determined by the meanings of its parts.
Compositionality is good: if you can
represent "coffee cup"
and "cake", it's nice to be able to represent
"coffee cup and cake" too. But if
you have (say) an image-processing
net which has learnt to represent
images of coffee cups by one pattern of
activations, and images of cakes by another,
you can't assume that
superimposing or adding the patterns will
represent a coffee cup and a piece of cake. So
there has been a lot of argument about questions like:
is the brain compositional; is composition by
superimposing activation patterns the only kind of
compositionality; and (as engineers) how can we design
nets that are compositional?

Yet another difference is that neural nets
are often said to be
"contextual": they are leaky, in that
what happens at one node can be very
sensitive to data elsewhere in the network.
That is different from my Prolog program,
which will happily unify 2*x+3*y with
U+V, giving the same results no matter
what else is in my laptop's memory.
Again, there's been a lot of argument about this.
Unfortunately, a lot of the key papers on
this and the other topics, though on-line,
are locked up in pay-to-view sites.
The reference below seems a good, and free, starting point:

I mention these differences
not only to show how far apart neural nets
are from symbolic systems, but also because category
theory must have a lot to offer in answering such
questions. Which is why the two references
below are exciting:

In the first, Brown and Porter
suggest using colimits to
understand brain activity in terms of the component structures and
neurons. A "colimit" is an operation
that takes a collection of objects and arrows, and in some
sense generates the smallest object that contains all the
information they hold but no more. Colimits can be used
to model "putting together" or "glueing" of all kinds of
things — sets, groups, categories, program
specifications,
spreadsheets, programs, ontologies — and this is what
Brown and Porter want them for.
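
Here is the most degenerate case I can write down, as a minimal
Haskell sketch: when the maps from the shared part are plain
inclusions, glueing two sets gives just their union, with the
shared part identified once rather than duplicated. Real colimits
glue along arbitrary arrows, not only inclusions, but the flavour
is the same.

import qualified Data.Set as Set

-- Glue two sets along their shared part. The result is the smallest
-- set containing all the information in both, and no more:
glue :: Ord a => Set.Set a -> Set.Set a -> Set.Set a
glue = Set.union

-- glue (Set.fromList ["hull", "cabin"]) (Set.fromList ["cabin", "roof"])
--   is fromList ["cabin", "hull", "roof"]: "cabin" appears once, not twice.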

Healy goes further. He sets up a
category Conc whose objects are theories
representing concepts. Arrows represent "is a
subconcept of": there is an arrow from c1 to c2 if
c1 is a subconcept of c2. He sets up another category Neur, whose
objects are nodes in a particular neural network.
Arrows represent the "can activate"
relationship: there is an arrow from n1 to n2 if n1
can activate n2. Such an arrow tells us
something about the neural network, namely
that there is a path between n1 and
n2, and that all the synaptic connections
along this path are strong enough to
propagate a signal from n1 to n2.
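
Here is a minimal Haskell sketch of such arrows, assuming a
network given as weighted edges and a threshold for "strong
enough". Note that these arrows compose, as a category requires,
because paths concatenate:

import qualified Data.Map as M

type Node = String
type Net  = M.Map Node [(Node, Double)]   -- weighted connections

-- An arrow n1 -> n2 exists when some path of strong-enough
-- connections joins them:
canActivate :: Double -> Net -> Node -> Node -> Bool
canActivate threshold net = go []
  where
    go seen n1 n2
      | n1 == n2       = True             -- the identity arrow
      | n1 `elem` seen = False            -- avoid cycles
      | otherwise      =
          any (\(n, w) -> w >= threshold && go (n1 : seen) n n2)
              (M.findWithDefault [] n1 net)

-- canActivate 0.5 (M.fromList [("n1", [("n2", 0.9)]),
--                              ("n2", [("n3", 0.7)])]) "n1" "n3"
--   is True: the composite arrow n1 -> n3 exists via n2.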

This wouldn't be interesting unless the categories Conc and Neur
were related, and indeed, Healy lets us define functors from Conc to Neur.
In effect, each functor maps a network of concepts, represented as
theories, to regions in a neural network where these concepts are
represented. You can now see where I took some of my introductory
examples from.

What's nice is that we can combine these neural-network
representations by using natural transformations. I hope I have this
right, because Healy doesn't give much detail, but the point seems
to be that a single neural network might learn different instances of the
same concept.

For example, imagine an ambidextrous robot with mirror-identical arms and
hands. Each hand has touch sensors: the left hand's sensors are
wired to one set of input nodes in the network, and the right hand's
to another set of input nodes. Assume the robot is taught to grasp an
orange with its left hand, and then, independently, with its right hand.
The region of its neural network fed from the left hand sensors learns a
set of associations between finger pressure and movement; the region fed
from the right hand learns, independently, a very similar set. This is an
example I made up, by the way, so blame me if it turns out to be
misleading.

The left hand and right hand associations can then be regarded as two
functors, L and R, from Conc to Neur. I'm not sure what the logical
content of the objects in Conc would be — theories describing what
an orange is for, or what its shape is, or how to grasp it — but
at any rate, they'd say something about oranges, and they'd be
represented in two different regions in the neural network. We can merge
these, Healy says, by setting aside a third region in the network which
combines corresponding outputs from each of the first two.

But is this in fact a valid representation of the learnt concepts? Yes, if
we let natural transformations guide us. As well as functors L and R, let
there be a functor M which maps Conc to this third region. Set up mappings
from L to M and R to M. Then, Healy shows, the third region is a valid
representation if these mappings are natural transformations.

What does this mean?
Going back
to my explanation of natural transformations in terms of green and red and
blue dots, the concept hierarchy is a network of concepts painted
in green. Functor L maps it to a picture of this network implemented in
neural nets and painted in red.
Functor R maps it to a different picture of this network implemented in
neural nets and painted in blue.
And if the mappings from L to M and from R to M are
natural transformations, it means L and R are
compatible in a special way, such that we can paint yet another
picture in purple, that combines the information in the red
picture with the information in the blue picture.

Can such ideas be applied to other kinds of neural network? Can we show
other kinds of neural network to be reflections, in suitably warped
Eganian mirrors, of logical theories? Can we characterise neural networks
by properties of the categories containing them? (I suspect many of these
categories won't have products, because of interference between
subnetworks, but that's only a passing thought.)

And — very importantly, because we engineers have such small
minds, and so need to break problems into tiny pieces before we can
understand them — can we use colimits and other tools to help us build
modular neural nets?

My third example concerns the Dynamical Systems
Hypothesis. To introduce it, consider first Newell and
Simon's Physical Symbol System Hypothesis, which
states that: "A physical symbol
system has the necessary and sufficient
means for intelligent action". I suppose every
hacker believes this, and hopes in their deepest soul that,
one day, they will write the program that shows
this to be true.

More recently, various researchers have
proposed using dynamical systems ideas to
understand cognition. This is the Dynamical
Systems Hypothesis, and there's some
background on it in these references:

I was reminded of this by a nice paper
I found, written by David
Tall about the learning of mathematical concepts. He wrote
it before the Dynamical Systems Hypothesis became
popular, but it uses dynamical systems ideas to
explain how learning one concept might be blocked
by a conflicting earlier version of the same concept:

So, in a spirit similar to that of
my neural network section, I ask: can we use
category theory to help us map between logical
descriptions of knowledge and their implementation as
dynamical systems? Can we use it to help
us build modular dynamical-systems implementations?

My fourth example
ratchets up the stakes, and turns
— I'll explain why in a minute — to
Margaret Thatcher. In the 1980s, she was Prime
Minister of Britain, and Denis Thatcher was her husband.
Ronald Reagan was President of America. So, if you were to
ask an American "Who is First Lady of
Britain?", what could they reply?
A reasonable answer is "Denis Thatcher".
But to make it, you must be willing to "slip"
the concept of First Lady, weakening it to First Spouse as you
transfer from an American frame of reference to a British one.

Now consider the following two number patterns:

A: 1 2 3 4 5 5 4 3 2 1

B: 1 2 3 4 4 3 2 1

What is to B as 4 is to A?
Most people would answer 3. It's to the
left of the central pair of numbers in B, just
as 4 is to the left of the central pair of numbers
in A. 4 in A and 3 in B are like Nancy Reagan and
Denis Thatcher: they're not the "most
distinguished point" in their landscape,
but they are closely related to that point.
They fill the same rôle.

Here is another number problem:

A: 1 2 3 4 5 5 4 3 2 1

D: 1 1 2 2 3 3 4 4 5 4 4 3 3 2 2 1 1

What is to D as 4 is to A?
A brute-force attempt to find D's
central pair won't work. But it seems
reasonable to "slip" the concepts
of "pair" and "singleton"
as we transfer from A's frame of
reference to D's. Which gives us the answer 4 4.

These problems come from a column
that Douglas
Hofstadter wrote for Scientific American. He collected them,
with later
additions, into a book, and I'm particularly
interested here in two of its chapters:

Douglas Hofstadter, Metamagical Themas,
Basic Books, 1985. See Variations on a Theme as the Crux of
Creativity (Chapter 12) and Analogies and Roles in Human and
Machine Thinking (Chapter 24).

Hofstadter argues that analogy is at the core of
human thought and creativity. To quote from Chapter 24:

Don't press an analogy too far,
because it will always break down. In that case,
what good are analogies? Why bother with them?
What is the purpose of trying to establish a
mapping between two things that do not
map onto each other in reality? The answer is
surely very complex, but the heart of it
must be that it is good for our survival
(or our genes' survival), because we
do it all the time. Analogy and reminding, whether they
are accurate or not, guide all our thought patterns.
Being attuned to vague resemblances is the hallmark of intelligence,
for better or for worse.

Therefore, he says, human analogical thought
is what we need to study. And we should do so
within "microdomains" such as the
number problems, because then — in proper
scientific spirit — we can isolate
analogical thought from other phenomena.

He was particularly interested in why we
slip some concepts more than others, and in
what one might call rôles versus literals. As an
example of the first: I'm in the pub, and
I accidentally spill my pint of beer over my shoes.
Why is it natural for me to "slip" my
situation to "Good job I don't have
to go anywhere smart tonight!",
but not to "Shame gravity isn't
weaker, so the beer would have stayed in the air!"?

As an example of the second,
why, when solving the two number problems,
do we perceive 4 as filling a rôle at all in
pattern A, then try to find a value for the
equivalent rôle in patterns B and D?
Why don't we just say, "Well, the
obvious answer is the same value, 4"?

There are some lovely examples
in those two chapters, which I do
recommend to anyone interested. But what's the connection with
categories?

Hofstadter and his colleagues
wrote several programs to implement
their theories of analogical thought
in microdomains. These programs include Copycat:

Page
on "Letter Analogies", Center for Research on Concepts and
Cognition
(also known as the Fluid Analogies
Research Group), Indiana University, 2002.

Robert French, The Subtlety of Sameness: a
Theory and Computer Model of Analogy-Making, MIT Press, 1995.

Doesn't that title sound like something to interest
every red-blooded n-categorist? Anyway, Tabletop, the program
French's book describes, solves problems such as "I am sitting at a
table and I touch my nose. What should you touch?".
Or "I touch the flower vase in the centre.
What should you touch?". Or, one more: "I
touch the cup near me. You have a glass but no cup. What should you
touch?"

Now, these programs are complicated.
They contain swarms of biologically-inspired enzyme-like entities
swimming around in a kind of
symbol-filled cytoplasmic glop, glomming onto bits of concepts, and slowly
crystallising out into a finished answer to the problem. They
lack — as far as I know — any easily understandable
semantics.

So, I suggest: can we
use categories to give them such a semantics? Or, more generally, to
formalise our ideas about the best
solutions to the kind of analogy problems that
Hofstadter and his colleagues discuss? It seems to
me that a good starting point would be some papers
by Joseph Goguen
on something he called "conceptual blending". In these,
he uses colimits and colimit-like operations. As I mentioned when
discussing neural nets, these
are often used to model
"putting together". And in "conceptual blending", Goguen models the putting together
of concepts to make metaphors. For example, putting
together the meaning of "house" and the meaning
of "boat" to make the meaning of "houseboat":
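
To convey the flavour, and no more, here is a minimal Haskell
sketch in which concepts are mere slot-to-filler maps, far cruder
than the theories Goguen actually blends; the slots and fillers
are my own inventions.

import qualified Data.Map as M

type Concept = M.Map String String

house, boat :: Concept
house = M.fromList [("use", "residence"), ("medium", "land")]
boat  = M.fromList [("use", "transport"), ("medium", "water")]

-- Non-conflicting slots are simply glued together. For a conflicting
-- slot we must choose a parent, and different choices give different
-- blends, just as "houseboat" and "boathouse" are different blends:
houseboat :: Concept
houseboat = M.fromList [("use",    house M.! "use"),     -- a residence...
                        ("medium", boat  M.! "medium")]  -- ...on water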

I shall finish this section with two more Hofstadter quotes.
As I said at the beginning, the stakes are high:

[…] I have been emphasising the idea of the internal
structure of one concept. In my view, the way that
concepts can bond together and form conceptual molecules on all
levels of complexity is a consequence of their internal structure.
What results from a bond may surprise us, but it will nonetheless
always have been completely determined by the concepts
involved in the fusion, if only we could understand how
they are structured. Thus the crux of the matter is the
internal structure of a single concept, and how it
"reaches out" towards things it is not.
The crux is not some magical, mysterious
process that occurs when two indivisble concepts collide;
it is a consequence of the divisibility of concepts into
subconceptual elements. As must be clear from this, I am
not one to believe that wholes elude descriptions in
terms of their parts. I believe that if we come to understand the
"physics of
concepts", then perhaps we can derive from it a
"chemistry of creativity", just as we can derive the
principles of the chemistry of atoms and molecules
from those of the physics of quanta and particles.
But as I said earlier, it is not just around the corner.
Mental bonds will probably turn out to be no less
subtle than chemical bonds.

You can think of concepts as stars,
and knob-twiddling as carrying you from one
point on an orbit to another point. If you twiddle
enough, you may well find yourself deep within the
attractive zone of an unexpected but interesting concept
and be captured by it. You may thus migrate from concept to
concept. In short, knob-twiddling is a device that carries
you from one concept to another, taking advantage of their
overlapping orbits.

Of course, all this cannot happen
with a trivial model of concepts. We see it happening
all the time in minds, but to make it happen in computers
or to locate it physically in brains will
require a fleshing-out of what concepts really
are. It's fine to talk of "orbits around
concepts" as a metaphor, but developing it into a
full scientific notion that either can be
realised in a computer model or can be located inside a
brain is a huge task. This is the task that faces
cognitive scientists if they wish to make "concept" a
legitimate scientific term. This goal, suggested at the
start of this article, could be taken to be the central
goal of cognitive science, although such things are
often forgotten in the inane hoopla that is surrounding
artificial intelligence more and more these days.

"These days" were in the 1980s,
so perhaps we can disregard the remark about
the inane hoopla. Though there seems to be a lot of
inane hoopla surrounding computing in general.
But what about the rest of the quotes? Does it
make sense to think about a
"chemistry of concepts"; to
make "concept"
a legitimate scientific notion?

Actually, Michael Healy, who
I mentioned earlier for his
work on mapping concepts to neural
nets, has, with coauthors
Thomas Caudell and Timothy Goldsmith, released another report:

Categorization and the judgement of similarity are
fundamental in cognition. We propose that these and other
activities are based upon an underlying structure of knowledge,
or concept representation, in the brain. Further, we propose
that this structure can be represented mathematically in a
declarative form via category theory, the mathematical
theory of structure. We test the resulting mathematical
model in an experiment in which human subjects provide
judgements of similarity for pairs of line drawings
using a numerical scale to represent degrees of
similarity. The resulting numerical similarities
are compared with those derived from the
category-theoretic model by comparing
diagrams. The diagrams represent distributed
concept structures underlying the line drawings.
To compare with a more conventional analysis
technique, we also compare the human
judgements with those provided by a two-dimensional feature space model
equipped with a distance metric for the line drawings.
The results are equally favorable for both models.
Because of this and the putative explanatory power of the
category-theoretic model, we propose that this model is
worthy of further exploration as a
mathematical model for cognitive science.

Which is extremely interesting, and
reminiscent of Hofstadter's remark: "Thus the
crux of the matter is the internal structure of a single
concept […]".

I'll finish by talking
about a paper by
the famous
category theorist William Lawvere.
It's the following
reference — which, I'm pleased to note,
is public at Google Books — and
I mention it because I found it in a book on cognition:

Despite some simplification in the above,
needed for rapid description, I hope that I have made
clear that there is a great deal of useful precision
lying behind my illustrations, and a great deal to be
developed on the same basis. Thus I believe to have
demonstrated the plausibility of my thesis
that category theory will be a necessary
tool in the construction of an adequately explicit science of knowing.

I don't know enough to
describe these "illustrations"
succinctly, but they include a notion
that he often refers to, the
"unity-and-identity-of-opposites", which I have
since discovered is also known as an adjoint cylinder.
I mention this only because it sounds like something wonderful.

Lawvere's illustrations also include the
notion of an "adequacy comonad".
I hadn't come across these before, but
Lawvere's explanation of them
sounds as though they might be very useful in understanding
approximations to concepts. Moreover, the adequacy
comonad sounds like an exquisite and intricate
piece of mathematical machinery,
and I hope someone will give me a working model.
One I can pick up, prod, peer into, pull to pieces, and
generally play with. Steam or electric.

However, I've another reason for
mentioning the notion. Lawvere uses it to show
how the study of a kind of
category called a
"closed category" contributes, via
adequacy comonads,
to
the study of "metric spaces". These are mathematical
structures that consist of points and
a way of defining the distance between them:
mathematicians use them in investigating the
concept of distance, abstracted away
from other concepts.

He then says, if I understand him correctly,
that, conversely, we can take
notions such as "radius" and
"engulfing" from
metric spaces and
create precise analogues of these notions
within certain kinds of category. His
wording is technical, so I've omitted
it from this
Dr. Dobb's blog version of my
answers, but you can find it in my
original
n-Category Café posting.

Perhaps Lawvere is using the words "radius"
and "engulfing" in some sense far removed
from their everyday meaning. But if not, can his
remark also help us import other spatial and dynamic notions
into category theory?
Such as "orbit"
(around a concept); "attractive zone"
(of a concept); "migrate"
(from concept to concept).

In other words, can we use it to
import into category theory,
and thus into categories of neural nets and other
cognition-related entities,
the metaphors that Hofstadter says must
be made precise before we can
create a "chemistry of concepts"? If so,
then category theory will be an invaluable tool for cognitive science.

Even if I'm completely wrong in
that last paragraph, Lawvere says
category theory "will be a necessary
tool in the construction of an adequately
explicit science of knowing", which is
good enough for me. So I did an experiment. I searched
Google and CiteSeerX on his main title,
Tools for the Advancement of Objective Logic.

Apart from citations in library catalogues, and
papers on philosophy or pure category theory, I
found only two authors whose papers seemed,
on a quick skim, to depend in some important
way on Lawvere's paper. For interest,
I give the references below. (The second author,
Zippora Arzi-Gonczarowski, had one paper on a pay-to-view
site that I couldn't read, but the abstract
shows me that it's about the same topic as the paper below):

Lawvere's paper seems filled
with crystals of compressed insight, glittering
through the surface of the text like quartz
through a covering of dry pine-needles. I
find them intensely satisfying to contemplate,
in the same way I do some
El
Lissitzky posters and De
Stijl typography.
Perhaps, high in my brain's
abstraction hierarchy, the
same neurons are activated by both: category
theory as the ultimate abstract sculpture.

But, aesthetics aside, how can we bring
such power to more researchers than the
two who cited it? I'm using Lawvere's paper
as a representative of category theory
generally, of course. Ask a first-year
LISP hacker to iterate over a list,
and they'll reach naturally for
recursion or a MAP function. Can we bring up a generation of
category-theory hackers, who,
when you ask them to model putting
something together, will reach
naturally for a colimit?

I thought I'd add one final
reason for my interest in category theory. Spreadsheets!

Excel spreadsheets are a big source of
errors. One reason is that Excel
doesn't have the notion of "modules":
chunks of program that can be written,
tested and debugged one at a time, then used
in as many different spreadsheets as you want.
If you develop a spreadsheet in Excel, and want to use part of it in
another
spreadsheet, you have to copy and paste cells from the first
spreadsheet to the second. If you then need to change these cells
— perhaps because they have bugs, or perhaps to make the formulae
in them faster — you have to make the same changes to both
spreadsheets. You can't just change one
component, and have the changes propagated to
every use of it. So Excel programmers tend to
accumulate numerous almost-identical copies
of their code. Version-control becomes a nightmare.

I've been experimenting with ways to
overcome this, via various schemes for modularising
spreadsheets. And below are references to a demo
of one such scheme — a prototype component
library for Google's online spreadsheet,
which lets you add code to a spreadsheet much as you
would bar-charts and other displays — and
to a paper about the component library for Google's
spreadsheet and for Excel. Both are for spreadsheet users, so not very
technical:
