Symbol Grounding and the Origin of Language

ABSTRACT: What language allows us to do is to "steal"
categories quickly and effortlessly through hearsay instead of having to
earn them the hard way, through risky and time-consuming sensorimotor "toil"
(trial-and-error learning, guided by corrective feedback from the consequences
of miscategorisation). To make such linguistic "theft" possible, however,
some, at least, of the denoting symbols of language must first be grounded
in categories that have been earned through sensorimotor toil (or else
in categories that have already been "prepared" for us through Darwinian
theft by the genes of our ancestors); it cannot be linguistic theft all
the way down. The symbols that denote categories must be grounded in the
capacity to sort, label and interact with the proximal sensorimotor projections
of their distal category-members in a way that coheres systematically with
their semantic interpretations, both for individual symbols, and for symbols
strung together to express truth-value-bearing propositions.

Meaning: Narrow and Wide

Philosophers tell us that there are two different approaches to explaining
what our thoughts are about, one wide and one narrow. According to the
wide approach, the object I am thinking about, that physical thing out
there in the world, is to be reckoned as part of the meaning of that thought
of mine about it; meaning is wider than my head. According to the narrow
approach, the locus of any meaning of my thoughts is inside my head; meaning
can be no bigger than my head. The question is: should psychologists who
aspire to study and explain the meaning of thoughts adopt a wide or a narrow
approach?

Here is an advantage of a wide approach: As wide meaning encompasses
both the internal thought and its external object, a complete explanation
of wide meaning would leave nothing out. Once you have arrived at a successful
explanation, there are no further awkward questions to be asked about how
thoughts get "connected" to what they mean, because what they mean is in
a sense already part of what thoughts are. But there are disadvantages
for a psychologist adopting this approach, because it would require him
to be so much more than a psychologist: He would have to be an authority
not only on what goes on in the head, but also on what goes on in the world,
in order to be able to cover all the territory over which thoughts can
range.

We can illustrate the plight of the wide psychological theorist with
an analogy to the roboticist: If one were trying to do "wide robotics,"
accounting not only for everything that goes on inside the robot, but also
for what goes on in the world, one would have to model both the robot and
the world. One would have to design a virtual world and then describe the
robot's states as encompassing both its internal states and the states
of the virtual world in which it was situated. In this ecumenical era of
"situated" robotics (Hallam & Malcolm 1994) this might not sound like
such a bad thing, but in fact wide robotics would be quite contrary to
the spirit and method of situated robotics, which emphasises using the
real world to test and guide the design of the robot precisely because
it is too hard to second-guess the world in any virtual-world model of
it (the world is its own best model). At some point the "frame problem"
(Pylyshyn 1987; Harnad 1993) always arises; this is where one has failed
to second-guess enough about the real world in designing the robot's virtual
world, and hence a robot that functions perfectly well in the virtual world
proves to be helpless in the real world.

So perhaps it's better to model only the robot, and leave the modeling
of the real world to cosmologists. The counterpart of this moral for the
psychologist would be that he should restrict himself to the narrow domain
of what's going on in the head, and likewise leave the wide world to the
cosmologists.

The Symbol Grounding Problem

But there are disadvantages to the narrow approach too, for, having
explained the narrow state of affairs in the head, there is the problem
of ensuring that these states connect with the wide world in the right
way. One prominent failure in this regard is the symbolic model of the
mind, according to which cognition is computation: thinking is symbol manipulation
(Pylyshyn 1984). After the initial and promising successes of Artificial
Intelligence, and emboldened by the virtually universal power that was
ascribed to computation by the Church-Turing Thesis (according to which
just about everything in the world can be simulated computationally), computationalism
seems to have run aground precisely on the problem of meaning: the symbol
grounding problem (Harnad 1990a). For in a narrow computational or language-of-thought
theory of meaning, thoughts are just strings of symbols, and thinking is
merely the rule-governed manipulation of those symbols, as in a computer.

But the symbols in such a symbol system, although they are systematically
interpretable as meaning what they mean, nevertheless fail to "contain"
their meanings; instead, the meanings reside in the heads of outside interpreters
who use the computations as tools. As such, symbol systems are not viable
candidates for what is going on in the heads of those outside interpreters,
on pain of infinite regress.

So cognition must be something more than computation. Was the fatal
failing of computation that it was a narrow theory, operating only on the
arbitrary shapes of its internal symbols, and leaving out their external
meaning? Do we have no choice but to subsume the wide world after all?

Subsuming the wide world has not been the response of the cognitive
modeling community. Some have chosen to abandon symbols and their attendant
grounding problems altogether (Brooks 1991), and turn instead to other
kinds of internal mechanisms, nonsymbolic ones, such as neural nets or
other dynamical systems (Van Gelder 1998).

Others have held onto symbol systems and emphasised that the solution
to the grounding problem is to connect symbol systems to the world in the
right way (Fodor 1987, 1994). The problem with this approach is that it
provides no clue as to how one is to go about connecting symbol systems
to the world in the right way, the way that links symbols to their meanings.
Hence one does have reason to suspect that this strategy is rather like
saying that the problem of a robot that has succeeded in its virtual world
but failed in the real world is that it has not been connected with the
real world in the right way: The notion of the "right connection" may conceal
a multitude of sins of omission that in the end amount to a failure of
the model rather than the connection. (Or, to put it another way, the hardest
problems of cognitive modelling may be in designing the robot's internal
states in such a way that they do manage to connect to the world in the
"right way.")

But let us grant that if the symbolic approach ever succeeds in connecting
its meaningless symbols to the world in the right way, this will amount
to a kind of wide theory of meaning, encompassing the internal symbols
and their external meanings via the yet-to-be-announced "causal connection."
Is there a narrow approach that holds onto symbols rather than giving up
on them altogether, but attempts to ground them on the basis of purely
internal resources?

There is a hybrid approach which in a sense internalises the problem
of finding the connection between symbols and their meanings; but instead
of looking for a connection between symbols and the wide world, it looks
only for a connection between symbols and the sensorimotor projections
of the kinds of things the symbols designate: It is not a connection
between symbols and the distal objects they designate but a connection
between symbols and the proximal "shadows" that the distal objects cast
on the system's sensorimotor surfaces (Harnad 1987; Harnad et al., 1991).

Categorization: A Robotic Capacity

Such hybrid systems place great emphasis on one particular capacity
that we all have, and that is the capacity to categorise, to sort the blooming,
buzzing confusion that reaches our sensorimotor surfaces into the relatively
orderly taxonomic kinds marked out by our differential responses
to it -- including everything from instrumental responses such as eating,
fleeing from, or mating with some kinds of things and not others, to assigning
a unique, arbitrary name to some kinds of things and not others (Harnad
1987).

It is easy to forget that our categorisation capacity is indeed a sensorimotor
capacity. In the case of instrumental responses, based on the Gibsonian
invariants "afforded" by our sensorimotor interactions with the world,
what we tend to forget is that these nonarbitrary but differential responses
are actually acts of categorisation too, partitioning inputs into those
you do this with and those you do that with. And in the case of unambiguously
categorical acts of naming, what we forget is that this is a sensorimotor
transaction too, albeit one subtended by an arbitrary response (a symbol)
rather than a nonarbitrary one.

So the capacity to sort our sensorimotor projections into categories
on the basis of sensorimotor interactions with the distal objects of which
they are the proximal projections is undeniably a sensorimotor capacity;
indeed, we might just as well call it a robotic capacity. The narrow hybrid
approach to symbol grounding to which I referred attempts to connect the
proximal sensorimotor projections of distal objects and events to either
the instrumental responses or the arbitrary names that successfully sort
them according to what is adaptive for the hybrid system. The instrumental
part is just an adaptive robot; but the arbitrary category names (symbols)
open up for the system a new world of possibilities whose virtues are best
described as those of theft over honest toil:

Acquiring Categories by Sensorimotor Toil and Darwinian Theft

Before defining sensorimotor toil vs. symbolic theft, let me define
a more primitive form of theft, which I will call "Darwinian theft." In
each case, what are being either stolen or earned by honest toil are sensorimotor
categories. It is undeniable that we have the capacity to detect
the proximal sensorimotor projections of some distal categories without
ever having to "earn" that capability in any way, because we are born that
way. Just as the frog is born with its Darwinian legacy of bug-detectors,
we also arrive with a "prepared" repertoire of invariance-detectors that
pick out certain salient shapes or sounds from the otherwise blooming,
buzzing confusion reaching our sensorimotor surfaces without our first
having to learn them by trial and error: For sensorimotor trial and error,
guided by corrective feedback from the consequences of miscategorising
things, is what I am calling "honest toil." [Footnote 1]

We must infer that our brains, in addition to whatever prepared category-detectors
they may be born with, are also born with the capacity to learn to distinguish
the sensorimotor projections of members from those of nonmembers of countless
categories through supervised learning, i.e., through trial and error,
guided by corrective feedback from the consequences of categorising correctly
and incorrectly (Harnad 1996b). The learning mechanism that learns the
invariants in the sensorimotor projection that eventually allow the system
to sort correctly is not known, but neural nets are a natural candidate:
Learning to categorise by honest toil is what nets seem to do best (Harnad
1993b). [Footnote 2]
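To fix ideas, here is a minimal sketch of what "honest toil" might look like computationally. It assumes, purely for illustration, that proximal projections are encoded as feature vectors and that corrective feedback arrives as a binary consequence signal; the perceptron-style update rule stands in for whatever supervised learning mechanism (neural net or otherwise) actually does the work.

```python
# A minimal sketch of category learning by "honest toil": a detector is
# shaped by nothing but corrective feedback from the consequences of
# miscategorisation. The feature encoding and the perceptron-style update
# are illustrative assumptions, not a claim about the actual mechanism.

def toil(projections, consequences, epochs=50, lr=0.1):
    """Learn to sort proximal projections (feature vectors) into two
    categories, guided only by binary feedback (e.g. sickened or not)."""
    n = len(projections[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, outcome in zip(projections, consequences):
            guess = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            error = outcome - guess            # corrective feedback
            if error:                          # miscategorisation: adjust the detector
                w = [wi + lr * error * xi for wi, xi in zip(w, x)]
                b += lr * error
    return w, b

# Hypothetical proximal features: [is_white, has_red_spots, is_large]
projections  = [[1, 1, 0], [1, 1, 1], [0, 1, 0], [1, 0, 1], [0, 0, 0]]
consequences = [1,         1,         0,         0,         0]   # 1 = made the eater sick
w, b = toil(projections, consequences)   # learns a detector consistent with the feedback
```

Nothing hangs on the perceptron specifically; any error-corrected learner would illustrate the same point.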

Does a system such as the one described so far -- one that is able to
sort its proximal projections, partly by Darwinian theft, partly by sensorimotor
toil -- suggest a narrow or a wide explanation of meaning (insofar as it
suggests anything about meaning at all)? It seems clear that whatever distal
objects such a system may be picking out, the system is working entirely
on the basis of their proximal projections, and whatever connection those
may have with their respective distal objects is entirely external to the
system; indeed, the system would have no way of distinguishing distal objects
if the distinction were not somehow preserved in their proximal projections
-- at least no way without the assistance either of sensorimotor aids of
some kind, or of another form of theft, a much more important one, to which
I will return shortly.

Categories Are Context-Dependent, Provisional, and Approximate

So for the time being we will note that the categories of such a system
are only approximate, provisional, context-dependent ones: They depend
on the sample of proximal projections of distal objects the system happens
to have encountered during its history of sensorimotor toil (or what its
ancestors happen to have encountered, as reflected by the detectors prepared
by Darwinian theft): whatever systematic relation there might be between
them and their distal objects is external and inaccessible to the system.

Philosophers tend, at this juncture, to raise the so-called "problem
of error": If the system interacts only with its own proximal projections,
how can its categories be said to be right or wrong? The answer is inherent
in the very definition of sensorimotor toil: The categories are learned
on the basis of corrective feedback from the consequences of miscategorisation.
It is an error to eat a toadstool, because it makes you sick. But whether
the proximal projection of what looks like a safely edible mushroom
is really an edible mushroom, or a rare form of toadstool with an indistinguishable
proximal projection that does not make you sick but produces birth defects
in your great-grandchildren, is something of which a narrow system of the
kind described so far must be conceded to be irremediably ignorant. What
is the true, wide extension of its mushroom category, then? Does its membership
include only edible mushrooms that produce no immediate or tardive negative
effects? Or is it edible mushrooms plus genetically damaging toadstools?
This is a distal difference that makes no detectable proximal difference
to a narrow system such as this one. Does it invalidate a proximal approach
to meaning?

Let's ask ourselves why we might think it invalidates proximal
meaning. First, we know that two different kinds of distal objects
can produce the same proximal projections for a system such as this one.
So we know that its narrow detector has not captured this difference. Hence,
by analogy with the argument we made earlier against ungrounded symbol
systems, such a narrow system is not a viable candidate for what is going
on in our heads, because we ourselves are clearly capable of making the
distal distinction that such a narrow system could not make.

Distinguishing the Proximally Indistinguishable

But in what sense are we able to make that distal distinction? I can
think of three senses in which we can. One of them is based, trivially,
on sensorimotor prostheses: an instrument could detect whether the mushroom
was tardively carcinogenic, and its output could be a second proximal projection
to me, on whose basis I could make the distinction after all. I take it
that this adds nothing to the wide/narrow question, because I can sort
on the basis of enhanced, composite sensorimotor projections in much the
same way I do on the basis of simple ones.

The second way in which the distal distinction might be made would be
by appealing to what I have in mind when I think of the safe mushroom
and its tardively toxic lookalike, even if it were impossible to provide
a test that could tell them apart. I don't have the same thing in mind
when I think of these things, so how can a narrow system that would assign
them to the same category be a proper model for what I have in mind? Let
us set aside this objection for the moment, noting only that it is based
on a qualitative difference between what I have in mind in the two cases,
one that it looks as if a narrow model could not capture. I will return
to this after I discuss the third and most important sense in which we
could make the distinction, namely, verbally, through language.

We could describe the difference in words, thereby ostensibly picking
out a wide difference that could not be picked out by the narrow categorisation
model described so far.

Acquiring Asocial Categories Asocially, by Sensorimotor Toil

Let me return to the narrow model with the reminder that it was to be
a hybrid symbolic/dynamic model. So far, we have considered only
its dynamic properties: It can sort its proximal projections based on error-correcting
feedback. But, apart from being capable of instrumental sensorimotor interactions
such as eating, fleeing, mating or manipulating on the basis of invariants
it has learned by trial and error to detect in its proximal projections,
such a system could, as I had noted, simply assign a unique arbitrary name
on the basis of those same proximal invariants.

What would the name refer to? Again, it would be an approximate, provisional
subset of proximal projections that the system had learnt to detect as
being members of the same category on the basis of sensorimotor interaction,
guided by error-correcting feedback. What was the source of the feedback?
Let us quickly set aside (contra Wittgenstein 1953) the idea that acquiring
such a "private lexicon" would require interactions with anything other
than objects such as mushrooms and toadstools: In principle, no verbal,
social community of such hybrid systems is needed for it to produce a lexicon
of arbitrary category names, and reliably use them to refer to the distal
objects subtended by some proximal projections and not others (cf. Steels
and Kaplan 1999; Steels 2001); there would merely be little point in doing
so asocially. For it is not clear what benefit would be conferred by such
a redundant repertoire of names (except possibly as a mnemonic rehearsal
aid in learning), because presumably it is not the arbitrary response of
naming but the instrumental response of eating, avoiding, etc., that would
"matter" to such a system -- or rather, such an asocial system would function
adaptively when it ate the right thing, not when it assigned it the right
arbitrary name.

But if the system should happen to be not the only one of its kind,
then in principle a new adaptive road is opened for it and its kind, one
that saves them all a lot of honest toil: the road of theft, linguistic
theft.

The Advantages of Acquiring Categories by Symbolic Theft

At this point some of the inadvertent connotations of my theft/toil
metaphor threaten to obscure the concept the metaphor was meant to highlight:
Acquiring categories by honest toil is doing it the hard way, by trial
and error, which is time-consuming and sometimes perhaps too slow and risky.
Getting them any "other" way is getting them by theft, because you do not
expend the honest toil.

This is transparent in the case of Darwinian theft (which is perhaps
better described as "inherited wealth"). In the case of symbolic theft,
where someone else who has earned the category by sensorimotor toil simply
tells you what's what, "theft" is also not such an apt metaphor,
for this seems to be a victimless crime: Whoever tells you has saved you
a lot of work, but he himself has not really lost anything in so doing;
so you really haven't stolen anything from him. "Gift," "barter," or "reciprocal
altruism" might be better images, but we are getting lost in the irrelevant
details of the trope here. The essential point is that categories can be
acquired by "nontoil" through the receipt of verbal information (hearsay)
as long as the symbols in the verbal message are already grounded
(either by sensorimotor toil or, recursively, by prior grounded verbal
messages) -- and of course as long as there is someone around who has the
category already and is able and willing to share it with you.

Nonvanishing Intersections

Language, according to the model being described here, is a means of
acquiring categories by theft instead of honest toil. I will illustrate
with some examples I have used before (Harnad 1987, 1996a), by way of a
reply to a philosophical objection to sensorimotor grounding models, the
"vanishing intersections" objection: Meaning cannot be grounded in shared
sensorimotor invariants, because the intersection of the sensorimotor projections
of many concrete categories and all abstract categories is empty: Perhaps
the sensorimotor projections of all "triangles" and all "red things" have
something in common -- though one wonders about triangles at peculiar angles,
say, perpendicular to the viewer, reducing them to a line, or red things
under peculiar lighting or contrast conditions where the reflected light
is not in the red spectral range -- but surely the intersections of the
sensorimotor projections of all "plane geometric shapes" vanish, as do
those of "coloured things," or "chairs," "tables," "birds," "bears," or
'games," not to mention the sensorimotor projections of all that is "good,"
or "true" or "beautiful"!

By way of response I make two suggestions:

(1) Successful Sorting Capacity Must Be Based on Detectable
Invariance.

The theorist who wishes to explain organisms' empirical success in sorting
sensorimotor projections by means other than a detectable invariance shared
by those projections (an invariance that of course need not be positive,
monadic and conjunctive, but could also be negative, disjunctive, polyadic,
conditional, probabilistic, or constructive -- i.e., the result of any operation
performed on the projection, including invariance under a projective transformation
or under a change in relative luminance -- indeed, any complex boolean
operation) has his work cut out for him if he wishes to avoid recourse
to miracles, something a roboticist certainly cannot afford to do. Darwinian
theft (innateness) is no help here, as Darwinian theft is as dependent
on nonvanishing sensorimotor intersections as life-time toil is. It seems
a reasonable methodological assumption that if the projections can be successfully
partitioned, then an objective, nonvanishing basis for that success must
be contained within them.

(2) The Invariance Can Be Learned Via Experience or Via Hearsay.

This is also the point at which linguistic theft comes into its own. Consider
first the mushroom/toadstool problem: In a mushroom world I could earn
these two important survival categories the hard way, through honest toil,
sampling the sensorimotor projections and trying to sort them based on
feedback from sometimes getting sick and sometimes getting nourished (Cangelosi
& Harnad 2000). Assuming the problem is soluble (i.e., that the projections
are successfully sortable), then if I have the requisite learning capacity,
and there is enough time in the day, and I don't kill myself or die of
hunger first, I will sooner or later get it right, and the basis of my
success will be some sort of invariance in the projections that some internal
mechanism of mine has laboriously learned to detect. Let's simplify and
say that the invariant is the boolean rule "if it's white and has red spots,
it's a toxic toadstool, otherwise it's an edible mushroom."

Life is short, and clearly, if you knew that rule, you could have saved
me a lot of toil and risk by simply telling me that that was the
invariant: A "toadstool" is a "mushroom" that is "white" with "red spots."
Of course, in order to be able to benefit from such a symbolic windfall,
I would have had to know what "mushroom" and "white" and "red" and "spots"
were, but that's no problem: symbolic theft is recursive, but not infinitely
regressive: Ultimately, the vocabulary of theft must be grounded directly
in honest toil (and/or Darwinian theft); as mentioned earlier, it cannot
be symbolic theft all the way down.

So far we have not addressed the vanishing-intersections problem, for
I pre-emptively chose a concrete sensory case in which the intersection
was stipulated to be nonvanishing. Based on (1), above, we can also add
that empirical success in sensorimotor categorisation is already a posteriori
evidence that a nonvanishing intersection must exist. So it is probably
reasonable to assume that a repertoire of unproblematic concrete categories
like "red" and "spotted" exists. The question is about the more problematic
cases, like chairs, bears, games, and goodness. What could the sensorimotor
projections of all the members of each of these categories possibly share,
even in a Boolean sense?

The "Peekaboo Unicorn"

By way of a reply, I will pick an even more difficult case, one in which
it is not only their intersection that fails to exist, but the sensorimotor
projections themselves. Enter my "Peekaboo Unicorn": A Peekaboo Unicorn
is a Unicorn, which is to say, it is a horse with a single horn, but it
has the peculiar property that it vanishes without a trace whenever senses
or measuring instruments are trained on it. So, not just in practice, but
in principle, you can never see it; it has no sensorimotor projections.
Is "Peekaboo Unicorn" therefore a meaningless term?

I want to suggest that not only is "Peekaboo Unicorn" perfectly meaningful,
but it is meaningful in exactly the same sense that "a toadstool is a white
mushroom with red spots" or "a Zebra is a horse with black and white stripes"
are meaningful. Via those sentences you could learn what "toadstool" and
"zebra" meant without having to find out the hard way -- though you could
certainly have done it the hard way in principle. With the Peekaboo Unicorn,
you likewise learn what the term means without having to find out the hard
way, except that you couldn't have found out the hard way; you had to have
the stolen dope directly from God, so to speak.

Needless to say, most linguistic theft is non-oracular (though it certainly
opens the possibility of getting meaningful, unverifiable categories by
divine revelation alone); all it requires is that the terms in which it
is expressed should themselves be grounded, either directly by honest toil
(or Darwinian theft), or indirectly, by symbolic theft, whose own terms
are grounded either by... etc.; but ultimately it must all devolve on terms
grounded directly in toil (or Darwin).

In particular, with the Peekaboo Unicorn, it is "horse," "horn," "sense,"
"measuring instrument" and "vanish," that must be grounded. Given that,
you would be as well armed with the description of the Peekaboo Unicorn
as you would be with the description of the toadstool or the zebra to correctly
categorise the first sensorimotor projection of one that you ever encountered
-- except that in this case the sensorimotor projection will never be encountered
because it does not exist (a "quark" or "superstring" might be examples
of unobservable-in-principle things that do exist). I hope it is
transparent that the ontic issue about existence is completely irrelevant
to the cognitive issue of the meaningfulness of the term. (So much for
wide meaning, one is tempted to say, on the strength of this observation
alone.)

Meaning or Grounding?

At this point I should probably confess, however, that I don't believe
that I have really provided a theory of meaning here at all, but
merely a theory of symbol grounding, a theory of what a robot needs in
order to categorise its sensorimotor projections, either as a result of
direct trial and error learning or as a result of having received a string
of grounded symbols: A symbol is grounded if the robot can pick out which
category of sensorimotor projections it refers to. Grounding requires an
internal mechanism that can learn by both sensorimotor toil and symbolic
theft. I and others have taken some steps toward proposing hybrid symbolic/nonsymbolic
models for toy bits of such a capacity (Harnad et al. 1991; Tijsseling
& Harnad 1997; Cangelosi, Greco & Harnad 2000), but the real challenge
is of course to scale up the sensorimotor and symbolic capacity to Turing-Test
scale (Harnad 2000d).

Imagine a robot that can do these things indistinguishably from the
way we do. What might such a robot be missing? This is the difference between
grounding and meaning, and it brings us back to a point I deferred earlier
in my paper, the point about what I have in mind when I mean something.

Mind and Meaning

According to the usual way that the problem of mind and the problem
of meaning are treated in contemporary philosophy, there are not one but
two things we can wonder about with respect to that Turing-Scale Robot:

(i) Is it conscious? Is there anything it feels like to be that
robot? Is there someone home in there? Or does it merely behave exactly
as-if it were conscious, but it's all just our fantasy, with no one home
in there?

(ii) Do its internal symbols or states mean anything? Or are they merely
systematically interpretable exactly as-if they meant something, but it's
all just our fantasy, with no meaning in there?

For (ii), an intermediate case has been pointed out: Books, and even
computerised encyclopedias, are like Turing Robots in that their symbols
are systematically interpretable as meaning this and that, and they can
bear the weight of such a systematic interpretation (which is not a trivial
cryptographic feat). Yet we don't want to say that books and computerised
encyclopedias mean something in the sense that our thoughts mean something,
because the meanings of symbols in books are clearly parasitic on our thoughts;
they are mediated by an external interpretation in the head of a thinker.
Hence, on pain of infinite regress, they are not a viable account of the
meaning in the head of the thinker.

But what is the difference between a Turing Robot and a computerised
encyclopedia? The difference is that the robot, unlike the encyclopedia,
is grounded: The connection between its symbols and what they are interpretable
as being about is not mediated by an external interpreter; it is direct
and autonomous. One need merely step aside, if one is sceptical, and let
the robot interact with the world indistinguishably from us, for a lifetime,
if need be.

Meaning and Feeling

What room is there for further uncertainty? The only thing left, to
my mind, is (i) above, namely, while the robot interacts with the
world, while its symbols connect with what they are grounded in,
does it have anything in mind? There is, in other words, something
it feels like to mean something; a thought means something only
if (a) it is directly grounded in what it is otherwise merely interpretable-by-others
as meaning and (b) it feels like something for the thinker
to think that thought (Harnad 2000, 2001).

If you are not sure, ask yourself these two questions: (i) Would you
still be sceptical about a grounded Turing-scale Robot's meanings if you
were guaranteed that the robot was conscious (feeling)? And (ii) Would
you have any clue as to what the difference might amount to if there were
two Turing Robots, both guaranteed to be unconscious zombies, but
one of them had meaning and grounding, whereas the other had only
grounding? Exercise: flesh out the meaning, if any, of that distinction!
(Explain, by hearsay, what it is that the one has that the other lacks,
if it's neither grounding nor feeling; if you cannot, then the distinction
is just a nominal one, i.e., you have simply chosen to call the same thing
by two different names.)

Where does this leave our narrow theory of meaning? The onus is on the
Turing Robot that is capable of toil and theft indistinguishable from our
own. As with us, language gives it the power to steal categories far beyond
the temporal and spatial scope of its sensorimotor experience. That is
what I would argue is the functional value of language in both of us. Moreover,
it relies entirely on its proximal projections, and on the internal mechanisms
in between, for its grounding. Its categories are all provisional and approximate;
ontic considerations and distal connections do not figure in them, at least
not for the roboticist. And whether it has mere grounding or full-blown
meaning is a question to which only the robot can know the answer -- but
unfortunately that's the one primal sensorimotor category that we
cannot pilfer with symbols (Harnad 1991).

Harnad, S. (1991) Other bodies, Other minds: A machine incarnation of
an old philosophical problem. Minds and Machines 1: 43-54. http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad91.otherminds.html

1 There are no doubt intermediate cases, in which 'prepared' category
detectors must first be "activated" through some early sensorimotor exposure
to and interaction with their biologically "expected" members (Held and
Hein 1963) before we can help ourselves to the category they pick out.
There are also ways of quantifying how interconfusable the initial sensorimotor
projections are, and how much toil it takes to resolve the confusion. For
the most part, however, the distinction between prepared categories (Darwinian
theft) and unprepared ones learned by trial and error (honest toil) can
be demonstrated empirically.

2 Nets are also adept at unsupervised learning, in which there is no
external feedback indicating which sensorimotor projections belong in the
same category; in such cases, successful sorting can only arise from the
pattern of structural similarities among the sensorimotor projections themselves
-- the natural sensorimotor landscape, so to speak. Some of this may be
based on proximal boundaries already created by Darwinian theft (inborn
feature detectors that already sort projections in a certain way), but
others may be based on natural gaps between the shapes of distal objects,
as reflected in their proximal projections: sensorimotor variation is not
continuous. We do not encounter a continuum of intermediate forms between
the shape of a camel and the shape of a giraffe. This already reduces some
of the blooming, buzzing confusion we must contend with, and unsupervised
nets are particularly well suited to capitalising upon it, by heightening
contrasts and widening gaps in the landscape through techniques such as
competitive learning and lateral inhibition. Perhaps the basis for this
should be dubbed 'cosmological theft' (Harnad 1976). There are also structural
similarity gradients to which unsupervised nets are responsive, but, being
continuous, such gradients can only be made categorical by enhancing slight
disparities within them. This too amounts to either Darwinian or cosmological
theft. In contrast, honest toil is supervised by the error-correcting feedback
arising from the consequences to the system of having miscategorised something,
rather than from the a priori structure of the proximal sensorimotor projection.
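For concreteness, here is a minimal sketch of the kind of unsupervised mechanism alluded to above: winner-take-all competitive learning, which pulls prototypes toward the natural gaps between clusters of proximal projections without any corrective feedback. The two-cluster "camel/giraffe" data and the particular update rule are illustrative assumptions, not a model drawn from the text.

```python
# A minimal sketch of unsupervised sorting: winner-take-all competitive
# learning nudges prototype vectors toward natural clusters in the proximal
# projections, with no corrective feedback at all. The data and the
# two-prototype set-up are illustrative assumptions.

import random

def competitive_learning(projections, n_prototypes=2, epochs=100, lr=0.05):
    dims = len(projections[0])
    protos = [[random.random() for _ in range(dims)] for _ in range(n_prototypes)]
    for _ in range(epochs):
        for x in projections:
            # The nearest prototype "wins" and moves toward the input;
            # the losers stay put (a crude stand-in for lateral inhibition).
            winner = min(protos, key=lambda p: sum((pi - xi) ** 2
                                                   for pi, xi in zip(p, x)))
            for i in range(dims):
                winner[i] += lr * (x[i] - winner[i])
    return protos

# Camel-shaped vs giraffe-shaped projections, with no intermediate forms:
camels   = [[1.0 + random.gauss(0, 0.05), 0.2 + random.gauss(0, 0.05)] for _ in range(20)]
giraffes = [[0.2 + random.gauss(0, 0.05), 1.0 + random.gauss(0, 0.05)] for _ in range(20)]
prototypes = competitive_learning(camels + giraffes)  # prototypes typically settle near the two clusters
```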