ABSTRACT: Turing's celebrated 1950
paper proposes a very general methodological criterion for modelling mental
function: total functional equivalence and indistinguishability. His criterion
gives rise to a hierarchy of Turing Tests, from subtotal ("toy") fragments
of our functions (t1), to total symbolic (pen-pal) function (T2 -- the
standard Turing Test), to total external sensorimotor (robotic) function
(T3), to total internal microfunction (T4), to total indistinguishability
in every empirically discernible respect (T5). This is a "reverse-engineering"
hierarchy of (decreasing) empirical underdetermination of the theory by
the data. Level t1 is clearly too underdetermined, T2 is vulnerable to
a counterexample (Searle's Chinese Room Argument), and T4 and T5 are arbitrarily
overdetermined. Hence T3 is the appropriate target level for cognitive
science. When it is reached, however, there will still remain more unanswerable
questions than when Physics reaches its Grand Unified Theory of Everything
(GUTE), because of the mind/body problem and the other-minds problem, both
of which are inherent in this empirical domain, even though Turing hardly
mentions them.

William Wimsatt (1954) originally proposed the notion of the "Intentional
Fallacy" with poetry in mind, but his intended message applies equally
to all forms of intended interpretation. It is a fallacy, according to
Wimsatt, to see or seek the meaning of a poem exclusively or even primarily
in the intentions of its author: The author figures causally in the text
he created, to be sure, but his intended meaning is not the sole or final
arbiter of the text's meaning. The author may have been hewing to his Muse,
which may have planted meanings in the text that he did not even notice,
let alone intend, yet there they are.

A good deal of even our most prosaic discourse has this somnambulistic
character if you look at it sufficiently closely, and the greater the creative
leap it takes, the less aware we are of where it came from (Hadamard 1949).
Even the banal act of pushing or not pushing a button at one's whim comes
under a fog of indeterminacy when scrutinised microscopically (Harnad 1982b;
Dennett & Kinsbourne 1995).

This paper is accordingly not about what Turing (1950)
may or may not have actually thought or intended. It is about the implications
of his paper for empirical research on minds and machines. So if there
is textual or other evidence that these implications are at odds with what
Turing actually had in mind, I can only echo the last line of Max Black's
(1952) doubly apposite essay on the "Identity of Indiscernibles," in which
the two opposing metaphysical viewpoints (about whether or not two things
that there is no way to tell apart are in reality one and the same thing)
are presented in the form of a dialogue between two interlocutors. Black
affects to be even-handed, but it is obvious that he favours one of the
two. The penultimate line, from the unfavoured one, is something like "Well,
I am still not convinced"; the last line, from the favoured one, is "Well,
you ought to be."

Turing is usually taken to be making a point about "thinking" (Dennett
1985; Davidson 1990). Using the example of a party game in which one must
guess which of two out-of-sight candidates is male and which is female
on the basis of written messages alone (today we would say on the basis
of email-interaction alone), Turing asks what would happen if the two candidates
were instead a man and a machine (let us say a digital computer, for the
time being) and we could not tell them apart. For then it would follow
that we could have no grounds for holding that one of them was thinking
and the other was not. Hence either the question "What is thinking?" is
meaningless, or it is provisionally answered by whatever proves successful
in passing such a test.

This is the gist of the "thought experiment" Turing invented in that
paper, but it clearly needs some more elaboration and analysis, for otherwise
it is open to misconstrual.

One misconstrual is that the outcome of such a "Turing Test" would accordingly
just be a trick: The successful machine candidate would not really be thinking
at all, and Turing has simply shown that it might be possible to fool us
in this way with a computer.

I don't think this is a correct interpretation of Turing's thought experiment.
Turing's point is much more substantive and positive than this. If there
is any implication about trickery, it is that we are fooling ourselves
if we think we have a more sensitive basis than the Turing Test (TT) for
determining whether or not a candidate is thinking: All functional tests
of thinking or intelligence are really just TTs (Harnad 1992b).

I will return to this point when I introduce the TT Hierarchy. For now,
note simply that the notion that "I have a better way" of discerning whether
a candidate is really thinking calls for a revelation of what that better
way is, relative to which the TT is merely a trick. It turns out that all
the "better ways" are either arbitrary or subsumed by a sufficiently general
construal of what the TT itself is. (Again, it matters little whether or
not Turing explicitly contemplated or intended this generalisation; it
is implicit in his argument.)

A second misconstrual of Turing's argument is that he somehow proved
that any candidate passing the TT must be thinking/intelligent: He certainly
did not prove that (nor is it amenable to proof, nor is it true of necessity),
and indeed there now exists a prominent counterexample -- not a proof either,
but a counterargument as strong as Turing's original argument -- for one
specific kind of candidate passing one specific version of the TT, namely,
a pure symbol-system passing the purely symbolic version of the TT. The
counterargument in question is Searle's (1980)
notorious "Chinese Room Argument," and I will return to it shortly (Harnad
1989).
Suffice it to say that Turing's is an epistemic point, not an ontic one,
and heuristic rather than demonstrative.

4. Equivocation About Thinking and Machines

So if passing the TT is neither evidence of trickery nor a guarantee
of intelligence, what is it?

To answer this, one must say much more explicitly what Turing's Argument
shows, and how. To do this, we have to get rid of the vexed word "thinking,"
which Turing in any case wanted to conclude was equivocal, if not incoherent.
It is equivocal in that the very distinction between "real" and "artificial"
here is equivocal. In a trivial sense, any natural thing that we succeed
in synthesising is not "real," precisely in that it is not natural but
synthetic, i.e., man-made (Harnad 1993b).
So the question of what thinking is, and whether a man-made machine can
have "real" thoughts, would be trivially answered if "real" simply means
"natural" (i.e., not man-made, not "synthetic"), for then nothing synthetic
is real.

So if the real/artificial distinction is not just the natural/synthetic
distinction, what is it then? Even the word "machine" is equivocal in exactly
the same sense, for what is a machine? Is it a synthetic, man-made mechanism?
Or is it any physical/causal mechanism at all? If cars or planes or robots
grew on trees, would that alone make them no longer machines? And if we
bioengineered an organic organism from scratch, molecule by molecule, would
that make it more a machine than its natural counterpart already is?

No. These distinctions are clearly too arbitrary. The deep issues here
are not about the natural/man-made distinction but about something else,
something going much deeper into the nature of mechanism itself,
irrespective of its provenance.

5. Functional Capacity: Mindful vs. Mindless

It is easy to give a performance-based or functional criterion for "intelligence"
(precisely the same point can be made about "thinking"): Intelligence is
as intelligence does. It requires intelligence to factor a quadratic equation.
Hence, any system that can do that, has, eo ipso, intelligence. But this
is clearly not what we mean by intelligence here, and defining it thus
does not answer any deep questions about what might be present or lacking
in a system along these lines.
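
To make "intelligence is as intelligence does" concrete, here is a minimal
sketch (my own illustration, nothing from Turing's paper) of a procedure that
does what intelligence does -- solve quadratic equations -- by rote symbol
manipulation alone:

```python
import cmath

def solve_quadratic(a, b, c):
    """Roots of a*x**2 + b*x + c = 0 by rote application of the
    quadratic formula: pure symbol manipulation, no understanding."""
    if a == 0:
        raise ValueError("not a quadratic: a must be nonzero")
    d = cmath.sqrt(b * b - 4 * a * c)   # discriminant (complex-safe)
    return ((-b + d) / (2 * a), (-b - d) / (2 * a))

print(solve_quadratic(1, -3, 2))  # ((2+0j), (1+0j)): x = 2 and x = 1
```

Whatever such a procedure has, it is plainly not what we are wondering about
when we wonder whether a system has a mind.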

What are these lines, then? I think it has become abundantly clear that
what we are really wondering about here is whether or not a system has
a mind: If it does intelligent, thoughtful, mind-like things, but does
so mindlessly, because there is nobody home in there, no one experiencing
experiences, feeling feelings, thinking thoughts, then it falls on one
side of the dividing line; if it does have a mind, be that mind ever so
limited, "thoughtless" and "unintelligent," it is on the other side. And
the natural fault marked by this dividing line, coincides, for better or
worse, with the mind/body problem.

6. Turing Indistinguishability and the Other-Minds Problem

So Turing is right that we should set aside the equivocal notion of
"thinking/intelligence," because nothing nonarbitrary hinges on it. What
we really want to know about the candidate is whether or not it has a mind
-- and this clearly cuts across the natural/synthetic distinction, for
if there is no way to determine whether or not a system has a mind on the
basis of its provenance (natural or man-made), then in learning that a
TT-passing candidate is a "machine," we have not really learned anything
at all (insofar as the question of whether it has a mind is concerned).

This is merely a special case of an even more general indeterminacy,
namely, the fact that no empirical test, TT-passing or otherwise, can tell
us whether or not a candidate has a mind! This is one of the lemmas we
inherit with the mind/body problem, and it is called the "other-minds problem"
(Harnad 1991).
(It would have cut short a lot of potential misconstruals of Turing's paper
if he had simply called a spade a spade here, but there we are; Turing
was not a philosopher but a mathematician.)

Turing did allude to the other-minds problem in passing, but only to
dismiss it as leading to "solipsism" and hence the end of rational discourse
("It may be the most logical view to hold but it makes communication of
ideas difficult"). Yet the other-minds problem is an inescapable methodological
constraint on mind-modelling (Harnad 1984, 1985). For the only way to know
for sure that a system has a mind is to be that system. All other ways
are risky, and risky in a way that is much more radical than the ordinary
Humean risk that goes with all empirical inferences and generalisations.
I am not referring here to philosophical scepticism about whether someone
as like me as my neighbour has a mind; let us take that for granted. But
what happens as we move away in likeness, toward other mammalian species,
invertebrates, micro-organisms, plants, man-made artifacts? Where does
mere philosophical scepticism leave off and real empirical uncertainty
kick in?

7. Out of Sight, Out of Mind: Dissociating Structure and Function

Turing implicitly partitioned this question. "Likeness" can take two
forms: likeness in structure and likeness in function. In sending the candidates
out of the room in the party game, Turing separated what he took to be
irrelevant and potentially biasing aspects of the superficial appearance
of intelligence from pertinent manifestations of the functional capacities
that constitute it. But in banishing physical structure from sight, Turing
also inadvertently excluded some potentially relevant aspects of function,
both internal and external to the candidate.

For human minds, at least, can do a lot more than just exchange words.
And as the TT is clearly predicated on functional likeness (indeed, functional
indistinguishability), to banish all our nonverbal functions from it would
be to limit the TT's domain of likeness in an almost arbitrary way, and
certainly in a way that goes against the spirit of a test that is predicated
on functional indistinguishability (Turing-indistinguishability).

8. The Expressive Power of Natural Language and Formal Computation

The special case of only words-in and words-out is not entirely arbitrary,
however, because of the remarkable expressive power of language, which
some have even suggested might be limitless and universal, including, as
it does, not only the capacity of words to express any possible proposition
[Steklis & Harnad 1976; Harnad 1996],
but also the full computational power of the Turing Machine [Harnad 1982a,
1994b].
Nevertheless, there are things that human beings can do that go beyond
mere verbalising; and perhaps more important, these nonverbal capacities
may functionally precede and underlie our capacity to have and exchange
words:
We are able, after all, to interact nonverbally with that world of objects,
events, and states that our words are about (e.g., we can detect, discriminate,
identify, and manipulate them). Indeed, it is hard to imagine how our words
could have the meanings they have if they were not first grounded in these
nonverbal interactions with the world (Harnad 1990a,
1992a,
1993a;
Crockett 1994).
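
As a concrete reminder of what "the full computational power of the Turing
Machine" amounts to, here is a minimal Turing-machine interpreter, offered as
an illustrative toy of my own (the rule format and the bit-flipping machine
are hypothetical): every step is bare symbol shuffling, with no connection to
anything the symbols might be about.

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=1000):
    """Minimal Turing machine: rules maps (state, symbol) to
    (new_state, write_symbol, move); move is -1, 0, or +1.
    Every step is pure symbol shuffling -- nothing 'means' anything."""
    cells = dict(enumerate(tape))
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        state, cells[head], move = rules[(state, symbol)]
        head += move
    return "".join(cells[i] for i in sorted(cells))

# A toy machine that flips every bit of its input, then halts:
flip = {
    ("start", "0"): ("start", "1", +1),
    ("start", "1"): ("start", "0", +1),
    ("start", "_"): ("halt", "_", 0),
}
print(run_turing_machine(flip, "0110"))  # prints 1001_
```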

9. The Turing Hierarchy: Level t1 (Toy Models)

It is accordingly time to introduce the TT hierarchy (Harnad 1994a).
The lowest level would properly be T1, but there is no T1, only t1, for
here "t" stands not for "Turing" (or "Total"), but for "toy," and t1 launches
the TT hierarchy without yet being a proper part of it: The t1 level of
Turing modelling and Turing Testing is the level of "toy models." These
are models for subtotal fragments of our functional capacity. At worst,
these fragments are just arbitrary ones; at best, some might be self-sufficient
modules, eventual components of our full-scale capacity, but not stand-alone
candidates in the Turing sense. For the TT is predicated on total functional
indistinguishability, and toys are most decidedly distinguishable from
the real thing. (Toys aren't Us.)

Before moving up to T2 we will make it explicit exactly why toys are
ultimately inadequate for the goals of Turing Testing. But note that all
of the actual mind-modelling research efforts to date are still only at
the t1 level, and will continue to be so for the foreseeable future: Cognitive
Science has not even entered the TT hierarchy yet.
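
For reference, the hierarchy as laid out in this paper can be summarised in
a single structure (the level names and glosses paraphrase the text; the
encoding itself is merely illustrative):

```python
# The TT hierarchy, ordered by decreasing empirical underdetermination
# (glosses paraphrase the paper; "t1" is sub-Turing, hence lowercase):
TT_HIERARCHY = {
    "t1": "toy models: subtotal fragments of our functional capacity",
    "T2": "total symbolic (pen-pal) capacity: symbols in, symbols out",
    "T3": "total external sensorimotor (robotic) capacity",
    "T4": "total internal microfunctional indistinguishability",
    "T5": "total indistinguishability in every empirical respect",
}

for level, gloss in TT_HIERARCHY.items():
    print(f"{level}: {gloss}")
```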

10. Level T5 (Grand Unified Theories of Everything [GUTEs])

Toy models, that is, subtotal models, of anything at all, have the liability
that they have more degrees of freedom than the real, total thing that
they are modelling. There are always more different ways to generate a
fragment of a system's function (say, chess-playing -- or simple harmonic
motion) than there are to generate the system's total function (say, everything
else a human mind can do -- or all of Newtonian or Quantum Mechanics),
if for no other reason than that the Total model must subsume all the functions
of the subtotal toy models too (Schweizer 1998; cf. Lucas 1961).
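
A hedged numerical illustration of this degrees-of-freedom point (my own, not
the paper's): infinitely many cubics pass exactly through a three-point
"fragment" of a curve, because one coefficient is left entirely free; only
the total data would pin it down.

```python
import numpy as np

def cubic_through(points, leading):
    """Return cubic coefficients [leading, a, b, c] passing exactly
    through three given points, for ANY choice of leading coefficient:
    one subtotal fragment, infinitely many exact models of it."""
    xs, ys = map(np.array, zip(*points))
    residual = ys - leading * xs**3        # what the lower-order terms must supply
    a, b, c = np.polyfit(xs, residual, 2)  # the unique quadratic through 3 points
    return [leading, a, b, c]

fragment = [(0.0, 1.0), (1.0, 2.0), (2.0, 9.0)]  # a "toy" slice of the data
for lead in (0.0, 1.0, -5.0):
    coeffs = cubic_through(fragment, lead)
    exact = np.allclose(np.polyval(coeffs, [0.0, 1.0, 2.0]), [1.0, 2.0, 9.0])
    print(lead, [round(k, 3) for k in coeffs], exact)  # every choice fits exactly
```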

We spoke of Humean indeterminacy earlier; philosophers are not yet of
one mind on the number of candidate Total Theories there are likely to
be, at the end of the empirical/theoretical road for, say, Physics. At
that Utopian endpoint, when all the data -- past, present and future --
are Totally accounted for, will there be one and only one Grand Unified
Theory of Everything (GUTE)? Or will there be several rival GUTEs, each
able to account for everything? If the latter, will the rivals posit an
equal number of parameters? And can we assume that the rival GUTEs will
just be notational variants of one another? If not, will we be right to
prefer the miniparametric GUTE, using Occam's Razor?

These are metaphysical questions, and physicists would be more than
happy to have reached that Utopian stage where these were the only questions
left to worry about. It is clear, however, that current physics is still
at the stage of subtotal theories (it would be churlish for any other science
to venture to call them "toys," as those of physics are surely the most advanced
such toys any science has yet produced). Subtotal models always run the
risk of being very wrong, although that risk diminishes as they scale up
toward a closer and closer approximation to totality (GUTE); the degrees
of freedom shrink, but they may not vanish altogether, even after we have
reached Utopia.

Utopia is also the top Level T5 of the Turing Hierarchy. We will return
to this after describing Levels T2-T4.

11. Utopian Differences That Make No Difference

The Humean risk that our theory is wrong, instead of shrinking to zero
even at totality, may remain in that nonzero state which philosophers call
the "underdetermination" of theory by data. The ever-present possibility
that a theory could be wrong, even after it accounts for all the data,
and that another theory might instead be right, is a liability that science
has learned to live with. By way of consolation, one can note that, at
the point of miniparametric totality -- when all the data have been accurately
predicted and causally explained by the GUTE(s) with the fewest parameters --
the difference between being right and wrong would no longer be one that
could make any palpable difference to any of us.

Consider the status of "quarks," for example. (I am not speaking now
of actual quarks, which have lately become experimentally observable, but
the original notion of a quark, which was a largish entity that could not
occur in "free state," but only tightly "bound" up together with other
quarks into a much tinier observable entity such as an electron.) The existence
of quarks was posited not because they were experimentally observed, but
because hypothesising that they existed, even though they could never be
observed empirically, made it possible to predict and explain otherwise
unpredictable and inexplicable empirical data. But quarks have since turned
out to be observable after all, so I am now speaking about what GUTEs would
have been like if they had contained unobservable-in-principle quarks.
[A more current example might be superstrings.]

At the end of the day, once all data were in and fully accounted for
by GUTEs that posited (unobservable-in-principle) quarks, one might still
wonder about whether quarks really exist. Maybe they do, maybe they don't.
There is no empirical way left to find out. Maybe the GUTEs that posit
quarks are wrong; but then other, quarkless, GUTEs must have components
that do for them the predictive/explanatory work that the quarks do for
the quarkful GUTEs. In any case, it is a fact about the quarkful GUTEs
that without positing quarks they cannot successfully predict/explain Everything.

12. Empirical Underdetermination and Turing Testing

I rehearse this indeterminacy explicitly in order to make it clear how
little is at stake: All data are successfully predicted and explained,
and we are just worrying about which (if any) of the successful Total,
Turing-Indistinguishable explanations is the real one; and no one can ever
be the wiser except an omniscient deity who simply knows the truth (e.g.,
whether or not there really are quarks).

This is normal scientific underdetermination. Notice that the name of
Turing has been introduced into it, because with GUTEs we are actually
at the top (T5) level of the Turing hierarchy, which turns out to be a
hierarchy of (decreasing) degrees of empirical underdetermination. The
t1 (sub-Turing) level, for both physical science (world-modelling) and
cognitive science (mind-modelling), is that of subtotal, toy models, accounting
for some but not all of the data; hence underdetermination is maximal in
toyland, but shrinks as the toys grow.

13. Basic Science Vs. Engineering: Forward and Reverse

Physical science and cognitive science are not really on a par, because
physical systems with minds are of course a part of the world too. Cognitive
science is probably better thought of as a branch of reverse bioengineering
(Dennett 1994): In normal engineering, the principles of physics (and engineering)
are applied to the design of physical systems with certain functions that
are useful to us. In reverse bioengineering, those systems have already
been designed by Nature (by the Blind Watchmaker, Darwinian Evolution),
and our task is to figure out how they work. The actual methodology turns
out to resemble forward engineering, in that we still have to design systems
that are more and more like the natural ones, starting with toy models,
and converging on Total Turing-Indistinguishability by a series of tighter
and tighter approximations.

Which brings us back to the Turing Hierarchy. Before we leave the t1
level, let it be stressed that, in principle, there can be a distinct Turing
Hierarchy for each species' functional capacities; hence there can be toy
models of mollusk, mouse and monkey, as well as man. Other species no doubt
have minds (I personally think all animals with nervous tissue, vertebrate
and invertebrate, do, and I am accordingly a vegetarian), but, as mentioned
earlier, our confidence in this can only diminish as we move further and
further from our own species. So it is not arbitrary that Turing Testing
should end at home (in the total performance capacities of our own species),
even if the toy modelling begins with the reverse engineering of "lesser"
species (Harnad 1994a).

So at the bottom level of the hierarchy are subtotal toy models of subhuman
and human functional capacities. Such toy models are also sub-Turing (which
is why this level is called t1 instead of T1), because the essential criterion
for Turing Testing is Total Indistinguishability from the real thing, and
toys are by definition subtotal, hence distinguishable. Nevertheless, the
idea of converging on Totality by closer and closer approximations is implicit
in Turing's paper, and it is the basis for his sanguine predictions about
what computers would successfully scale up to doing by our day. (Unfortunately,
his chronology was too optimistic, and this might have been because he
was betting on a particular kind of candidate, a purely computational one;
but if so, then he shouldn't have been, for the Turing spectrum admits
many other kinds of candidate mechanisms too.)

14. Level T2: The Pen-Pal Turing Test (Symbols-in/Symbols-out)

T2 is the conventional TT, the one I've taken to calling the "pen-pal"
version: words in and words out (Harnad 1989).
It is the one that most people have in mind when they think of Turing Testing.
But because of the unfortunate "imitation game" example with which it was
introduced, T2 has been subject to a number of misunderstandings, which
I will quickly try to correct here:

14.1. Life-Long: T2 is not a game (or if it is, it's no less
than the game of life!). It is a mistake to think of T2 as something that
can be "passed" in a single evening or even during an annual Loebner Prize
Competition (Loebner 1994;
Shieber 1994).
Although it was introduced in the form of a party game in order to engage
our intuitions, it is obvious that Turing intends T2 as a serious, long-term
operational criterion for what would now be called "cognitive science."
The successful candidate is not one that has fooled the judges in one session
into thinking it could perform indistinguishably from a real pen-pal with
a mind. (Fooling 70% of the people one time is as meaningless, scientifically,
as fooling 100% of the people 70 times.) The winning candidate will really
have the capacity to perform indistinguishably from a real pen-pal with
a mind -- for a lifetime, if need be, just as unhaltingly as any of the
rest of us can. (This is a variant of the misconstrual I mentioned earlier:
T2 is not a trick; Harnad 1992b.)

14.2. Symbolic/Nonsymbolic: T2 is restricted to verbal (i.e.,
symbolic) interactions: symbols in, symbols out. But it is not restricted
in principle to a computer doing only computation (symbol manipulation)
in between (although Turing so restricts it in practice). The candidate
"machine" can in principle be anything man-made, including machines other
than digital computers (e.g., sensorimotor transducers, analog computers,
parallel distributed networks, indeed any dynamical system at all; Harnad
1993a).

14.3. Man-Made: The reason the candidate needs to be man-made
is that all the tests in the TT Hierarchy are tests of mind-models. Interposing
a natural candidate that we did not build or understand would not be a
test of anything at all. Turing was indifferent about whether the thinking
thereby generated by the system was real or not -- he thought that that
was an uninteresting question -- but he did want to successfully synthesise
the capacity, in toto. So we could say that he did not care whether the
result would be the reverse bioengineering of the real mind, or the forward
engineering of something mind-like but mindless, but the engineering was
certainly intended to be real, and to really deliver the behavioral goods!

14.4. Mindful: Turing was not interested in the question of whether
the candidate would really have a mind, but that does not mean that the
question is of no interest. More important, he ought to have been
interested, because, logically speaking, without the mental criterion,
T2 is not a test of anything: it is merely an open-ended catalogue of what
a particular candidate can do. The indistinguishability is between the
candidate and a real person with a mind. So, like it or not, if T2 is not
merely to be testing the difference between two arbitrary forms of cleverness,
it must be testing mindful cleverness against mindless. And once they are
indistinguishable, then we no longer have a basis for affirming of the
one a predicate we wish to deny of the other (other than trivial predicates
such as that one is man-made and the other is not). And "mindful/mindless"
is just such a (nontrivial) predicate.

14.5. Conscious: Note that T2 is rightly indifferent to whether
the candidates are actually doing what they do consciously (Harnad 1982b;
Searle 1990). They might be doing it somnambulistically, as discussed earlier
in connection with the intentional fallacy. This does not matter: the real
person has a mind, even if it's only idling in the background, and that
would be a world of difference from a candidate that had no mind at all,
foregrounded or backgrounded, at that moment or any other. And if they're
indistinguishable (for a lifetime), then they're indistinguishable; so
if you are prepared to impute a mind to the one, you have no nonarbitrary
basis for denying it to the other.

14.6. Empirical/Intuitive: Turing Indistinguishability is actually
two criteria, one empirical and one intuitive: The empirical criterion
is that the candidate really has to have total T2 capacity, for a lifetime.
The intuitive criterion is that that capacity must be totally indistinguishable
from that of real people with minds, to real people with minds. (Note
the plural: not only is T2 not a one-night affair; it is not merely a dyadic
one either. Real pen-pals have the capacity to correspond with many people,
not just one.) The intuitive criterion may be almost as important as the
empirical one, for although (because of the other-minds problem) we are
not really mind-readers, our intuitive "theory-of-mind" capacities may be
the next best thing (Heyes 1998,
Mitchell & Anderson 1998), picking up subtleties that are dead give-aways
to our fine-grained intuitions while still out of the reach of the coarse-grained
reverse-engineering of our overall cognitive capacities.

15. Searle's Chinese Room Argument

Only corrective (14.2) above is unique to T2. The rest are applicable
to all the levels of the TT Hierarchy. T2 is a special level not just because
it is the one at which Turing originally formulated the TT, and not just
because of the special powers of both natural language and formal computation
on which it draws, but because it is the only level that is vulnerable
to a refutation, in the form of a counterexample.

So let us now briefly consider this celebrated counterexample to T2,
for it will serve to motivate T3 and T4: But first, note that the intentional
fallacy applies to Searle's (1980)
paper as surely as it applies to Turing's (1950)
paper: Searle thought he was refuting not only cognitive "computationalism"
(which he called "Strong AI" and will be defined in a moment), but also
the Turing Test. In fact, he was refuting only T2, and only in the special
case of one specific kind of candidate, a purely computational one: an
implementation-independent implementation of a formal symbol system. T2,
however, is not the only possible form a TT can take, and a pure symbol
system is certainly not the only possible candidate.

Searle (1980,
1984, 1993) defined his target, "Strong AI," as the hypothesis expressed
by the following three propositions:

(1) The mind is a computer program.

(2) The brain is irrelevant.

(3) The Turing Test is decisive.

This formulation was unfortunate, because it led to many misunderstandings
and misconstruals of Searle's otherwise valid argument that I do not have
the space to analyse in detail here (see Harnad 1989;
Hayes et al. 1992;
Harnad, 2000b).
No one holds that the mind is a computer program (with the absurd connotation
that inert code might have feelings). What the proponents of the view that
has since come to be known as (cognitive) "computationalism" hypothesise
is that mental states are really just computational states, which means
that they are the physical implementations of a symbol system -- of code,
running (Newell 1980; Pylyshyn 1980, 1984). This makes more sense.

Similarly, what Searle should have said instead of (2) was that the
physical
details of the implementation are irrelevant -- but without the implication
that implementing the code at all was irrelevant. Hence the rest
of the details about the brain are irrelevant; all that matters is that
it is implementing the right code.

To hold that the capacity to pass the TT (3) is what demonstrates that
the "right" code is being implemented, the one that implements a mind,
is unobjectionable, for that is indeed Turing's argument -- except of course
that Turing was not necessarily a cognitive computationalist! If just a
computer, running the right code, could successfully pass the pen-pal version
of the TT (T2), Turing would indeed have held that we had no better or
worse grounds for concluding that it had a mind than we would have in the
case of a real pen-pal, who really had a mind, for the simple reason that
we had no way to tell them apart -- their performance would be Turing indistinguishable.

However, Turing would have had to concede that real pen-pals can do
many other things besides exchanging letters; and it is hard to see how
a computer alone could do all those other things. So Turing would have
had no problem with an augmented candidate that consisted of more than
just a computer implementing the right code. Turing would have been quite
content, for example, with a robot, interacting with the world Turing indistinguishably
from the rest of us. But that version of the TT would no longer be T2,
the pen-pal version, but T3, the robotic version.

We will return to T3 shortly. It must also be pointed out that Turing
is not irrevocably committed to the candidate's having to be merely a computer
even in the case of T2. The T2-passing system could be something other
than just a computer running the right code; perhaps no computer/code combination
alone could successfully generate our lifelong pen-pal capacity, drawing
as the latter does on so much else that we can all do. We could, for example,
include a photo with one of our letters to a pen-pal, and ask the pen-pal
to comment on the quality of the blue object in the photo (or we could
simply enclose a blue object). A mere computer, without sensorimotor transducers,
would be completely nonplussed by that sort of input. (And the dynamical
system consisting of a computer plus sensorimotor transducers would no
longer be just a computer.)

Never mind. A (hypothetical) T2-passing computer is nevertheless fair
game, in this thought-experimental arena that Turing himself created. So
Searle takes Turing at his word, and supposes that there is a computer
program that is indeed capable of passing T2 (for a lifetime); he merely
adds the stipulation that it does so in Chinese only. In other words, the
candidate could correspond for a lifetime with Chinese pen-pals without
any of them having any reason to suspect that they were not communicating
with a real (Chinese-speaking) person. Turing's argument is that at the
end of the day, when it is revealed that the candidate is a computer, one
has learned nothing at all to over-rule the quite natural inference one
had been making all along -- that one was corresponding with a real pen-pal,
who really understood what all those letters (and his own replies) had
meant.

It is here that Searle makes a very shrewd use of the other two premises
-- that what is at issue is the code alone (1), and that although that
code must be implemented, the physical details of the implementation are
irrelevant (2): each and every implementation of the right code must have
a mind, if any of them does, and if they indeed have a mind only because
they are implementations of the right (i.e., T2-passing) code.

So Searle too makes a simple appeal to our intuitions, just as Turing
did in appealing to the power of indistinguishability (i.e., in noting
that if you can't tell them apart, you can't affirm of one what you deny
of the other): Searle notes that, as the implementation is irrelevant,
there is no reason why he himself should not implement the code. So we
are to suppose that Searle himself is executing all the computations on
the input symbols in the Chinese messages -- all the algorithms and symbol
manipulations -- that successfully generate the output symbols of the pen-pal's
replies (the ones that are indistinguishable from those of real life-long
pen-pals to real life-long pen-pals).

Searle then simply points out to us that he would not be understanding
the Chinese messages he was receiving and generating under these conditions
(any more than would a student mindlessly but successfully going through
the motions of executing some formal algorithm whose meaning he did not
understand). Hence there is no reason to believe that the computer, implementing
exactly the same code -- or indeed that any implementation of that code,
merely in virtue of implementing that code -- would be understanding either.
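
A caricature of what Searle is being asked to do (my own toy, not Searle's
hypothetical rulebook, which would have to be vastly larger to pass T2): the
executor matches input strings against rules and copies out replies, and
nothing in the procedure requires knowing what any of the symbols mean.

```python
# Hypothetical rulebook: opaque tokens in, opaque tokens out. To an
# executor who knows no Chinese, these are just meaningless shapes.
RULEBOOK = {
    "ni hao": "ni hao! hen gaoxing renshi ni.",
    "ni hui xia qi ma?": "hui, women lai xia yi pan ba.",
}

def searle_in_the_room(message: str) -> str:
    """Look the symbols up; copy the reply out. No understanding is
    required anywhere in this procedure."""
    return RULEBOOK.get(message, "qing zai shuo yi bian.")  # 'please say it again'

print(searle_in_the_room("ni hao"))
```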

What has happened here? Turing's "no way to be any the wiser about the
mind" argument rested on indistinguishability, which in turn rested on
the "other-minds" barrier: The only way to know whether any candidate other
than myself has or hasn't a mind is to be that candidate; but the
only one I can be is me, so Turing indistinguishability among candidates
with minds is all I have left.

What Searle has done is to exploit one infinitesimally narrow window
of vulnerability in cognitive computationalism and T2: No one can show
that a computer (or a rock) has or hasn't a mind, because no one can be
that computer or that rock. But the computationalism thesis was not about
the computer, it was about the running code, which was stipulated to be
implementation-independent. Hence the details of how the code is executed
are irrelevant; all that is needed is that the code be executed. And it
is indeed being executed by Searle, who is thereby being the running
code, being the candidate (there is no one else in sight!). And he (truthfully)
informs us that he does not understand a word: he is merely mindlessly
manipulating code.

This counterargument is not a proof (any more than Turing's own indistinguishability
argument is a proof). It is logically possible that simply going through
the motions of manipulating the right symbols does give birth to another
understanding mind -- not Searle's, for he does not understand, but whoever
in his head is answering the Chinese messages in Chinese. (It is, by the
same token, logically possible that if we taught someone illiterate and
innumerate how to apply the symbol-manipulation algorithm for solving quadratic
equations, that he, or someone else inside his head, would thereby come
to understand quadratic equations and their solution.)

17. Multiple Minds? The Searlean Test [ST]

This is spooky stuff. We know about multiple personality disorder --
extra minds cohabiting the same head, not aware of one another -- but that
is normally induced by early sexual abuse rather than by learning to manipulate
mindlessly a bunch of meaningless symbols.

So I think it makes more sense to give up on cognitive computationalism
and concede that either no implemented code alone could ever successfully
pass the lifetime T2, or that any code that did so would nevertheless fail
to have a mind; that it would indeed be just a trick. And the way to test
this would be via the "Searlean Test" (ST), by being the candidate, executing
the code oneself, and failing to experience the mental state (understanding)
that is being attributed to the pen-pal. As one has no intuitive inclination
whatsoever to impute extra mental states to oneself in the face of clear
1st-person evidence that one lacks them, the ST should, by the transitivity
of implementation-independence, rule out all purely computational T2-passers.

A purely computational T2-passer, in other words, is Turing-distinguishable
from a candidate with a mind, and the way to make the distinction is to
be the candidate, via the ST [Footnote 1].

18. Grounding Symbols in Their Objects Rather than in the Mind of
an External Interpreter

Has Turing lost a lot of territory with this? I don't think so. The
other-minds barrier is still there to ensure that no other candidate can
be penetrated and hence unmasked in this way. And besides, there are stronger
and more direct reasons than the ST for having doubts about cognitive computationalism
(Harnad 1990a):
The symbols in a symbol system need to be grounded in something other than
simply more symbols and the interpretations that our own minds project
onto them (as was the case with the symbolic pen-pal), no matter how coherent,
self-confirming, and intuitively persuasive those external interpretations
might be (Harnad 1990b,
1990c).
The symbols must be grounded directly and autonomously in causal interactions
with the objects, events and states that they are about, and a pure symbol-cruncher
does not have the wherewithal for that (as the photo/object accompanying
the letter to the pen-pal demonstrates).

What is needed to ground symbols directly and autonomously in the world
of objects? At the very least, sensorimotor transducers and effectors (and
probably many other dynamical internal mechanisms); and none of these are
implementation-independent; rather, they are, like most things, other-minds-impenetrable,
and hence immune to the ST and the Chinese Room Argument. The TT candidate,
in other words, cannot be just a disembodied symbol-cruncher; it must be
an embodied robot, capable of nonsymbolic, sensorimotor interactions with
the world in addition to symbolic ones, all Turing-indistinguishable from
our own. This is T3, and it subsumes T2, for pen-pal capacities are merely
subsets of robotic capacities (Harnad 1995a).
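
A toy sketch of the grounding idea (an illustration of mine, not a proposed
mechanism): a grounded system connects a symbol such as "blue" to sensor
readings via a categorization procedure, rather than only to further symbols.

```python
def ground_colour(rgb):
    """Map a raw (r, g, b) sensor reading directly to a colour token:
    the token is connected to the world, not just to other tokens."""
    r, g, b = rgb
    if b > r and b > g:
        return "blue"
    if r > g and r > b:
        return "red"
    return "other"

# An ungrounded symbol system has only symbol-to-symbol links
# ("blue" -> "a colour"); a grounded robot can also apply the token:
print(ground_colour((30, 40, 200)))   # blue -- e.g., the object in the photo
print(ground_colour((210, 40, 20)))   # red
```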

19. Level T3: The Robotic Turing Test

Note that just as scaling up from toys to totality is dictated by Turing's
functional equivalence/indistinguishability constraint, so is scaling up
from T2 to T3. For pen-pal capacities alone are clearly subtotal ones,
hence distinguishable as such. More important, pen-pal capacities themselves
surely draw implicitly on robotic capacities, even in T2, as demonstrated
not only by enclosing any nonsymbolic object with a letter for comment,
but surely (upon a little reflection) also by the very demands of coherent,
continuous discourse about real objects, events, and states via symbols
alone (French 1990; Harnad 1992a;
Crockett 1994, Watt 1996).

Why did Turing not consider T3 more explicitly? I think it was because
he was (rightly) concerned about factoring out irrelevant matters of appearance.
Differences in appearance certainly do not in and of themselves attest
to absence of mind: Other terrestrial species surely have minds; extra-terrestrial
creatures, if they exist, might have minds too; and so might robots. But
it's a safe bet that no current-generation robot has a mind, because they
are all still just at the most rudimentary toy level, t1, in their robotic
capacities (Pylyshyn 1987). Perhaps Turing simply thought our other-minds
intuitions would be hopelessly biased by robotic appearance in T3, but
not in T2. In any case, the recent progress in making robots look lifelike,
at least in movies, may outstrip our prejudices. (If anything, the problem
is that today's cinematic robots are getting ahead of the game, being endowed
with fictional T3 capacity while their real-world counterparts are still
functionally closer to mud than even to mollusks.)

20. Level T4: Internal Microfunctional Indistinguishability

Do T2-T3 exhaust the options for mind-modelling? No, for if the operational
constraint is functional equivalence/indistinguishability, there is still
a level of function at which robots are readily distinguishable from us:
the internal, microfunctional level -- and here the structure/function
distinction collapses too. For just as we can ask whether, say, facial
expression (as opposed to facial shape) is not in fact a legitimate component
of our robotic capacity (it is), so we can ask about pupillary dilation
(which allegedly occurs when we find someone attractive, and is Turing-detectable),
blushing, heart-rate, etc. These are all still arguably components of T3.
But, by the same token, we become speechless if kicked on the left side
of our heads, but not the right (Harnad et al. 1977), we exude EEG, we
bleed when we are cut, if we are opened up on an operating table we are
found to be made of flesh, and in our heads one finds gray/white matter
and neuropeptides rather than silicon chips.

These are all Turing-detectable functional differences between ourselves
and T3 robots: Are they relevant? We have already said that the fact that
the candidate is man-made is not relevant. Not relevant to what? To determining
whether or not the candidate has a mind. Do the neurobiological details
matter? Perhaps. We are not, after all, implementation-independent symbol
systems. Perhaps there are implementational details that are essential
to having a mind.

Surely there are. But just as surely, there must be implementational
details that are not essential to having a mind. And the problem
is that there is no way of knowing which are which. Hence brain-scanning
certainly plays no role in our perpetual Turing-testing of one another
(in our age-old, day-to-day coping with the Other-Minds Problem), any more
than mind-reading does (Harnad 1991).

We are now at the level of T4: Here the candidates are not only indistinguishable
from us in their pen-pal and robotic functions, but also in their internal
microfunctions. This is still the Turing hierarchy, for we are still speaking
about functional indistinguishability and about totality. But do we really
have to go this high in the hierarchy in practice? Indeed, is it even consistent
with the spirit and methodology of Turing Testing to insist that the candidate
be T4-indistinguishable from us? I think not, and I will now introduce
my own thought experiment to test your intuitions on this. My example will
be recognisably similar to the earlier discussion of underdetermination
and the Utopian GUTEs in physics.

21. Grand Tournament of T-Levels: T3 vs. T4 vs. T5

Suppose that at the end of the day, when mind-modelling attains its
Utopia, there are nine successful candidates. All nine are T3-indistinguishable
from us, but three of them fail T4 (cut them open and they are clearly
distinguishable in their internal microfunctions from us and our own real
neural functions). Of the remaining six, three pass T4, but fail T5: their
internal neural activities are functionally indistinguishable -- transplants
can be swapped between us and them, at any biocomponential scale -- but
theirs are still crafted out of synthetic materials; they are T4-indistinguishable,
but T5-distinguishable; only the last three are engineered out of real
biological molecules, physically identical to our own, hence T5-indistinguishable
from ourselves in every physical respect. The only remaining difference
among these last three is that they were designed in collaboration with
three different physicists, and each of the physicists happens to subscribe
to a different GUTE -- of which there also happen to be three left at the
end of the day.

Now note that the three T5 candidates are empirically identical in kind,
right down to the last electron. The only difference between them is their
respective designers' GUTEs (and recall that each GUTE accounts successfully
for all physical data). Now, according to one of the GUTEs, unobservable-in-principle
quarks exist; according to another, they do not, but their predictive/explanatory
work is done instead by some other unobservable entity. Both these GUTEs
have the same number of parameters. The third GUTE has fewer parameters,
and needs neither quarks nor their counterparts in the second GUTE.

So there we have our nine candidates: three T3's, three T4's and three
T5's. All nine can correspond with you as a pen-pal for a lifetime; all
nine can interact robotically with the people, objects, events and states
in the world indistinguishably from you and me for a lifetime. All nine
cry when you hurt them (whether physically or verbally), although three
of them fail to bleed, and three of them bleed the wrong kind of blood.
For the sake of argument, you have known all nine all your life (and consider
them all close friends), and this state of affairs (that all nine are man-made)
has only just been revealed to you today. You had never suspected it (across
-- let us say, to calibrate our intuitions -- 40 years of friendship and
shared life experiences!).

You are now being consulted as to which of the nine you feel can safely
be deprived of their civil rights as a result of this new revelation. Which
ones should it be, and why?

I think we are still squarely facing the Turing intuition here, and
the overwhelming answer is that none should be deprived of their
civil rights; none can be safely assumed to be "Zombies" (Harnad 1995b).
By all we can know that matters for such moral decisions about people with
minds, both empirically and intuitively, they all have minds, as surely
as any of the rest of us do [Footnote 2].

22. Level T3 Is Decisive

So does that mean that all differences above the level of T3 matter
as little (or as much) to having a mind as the differences between the
three Utopian GUTEs? Intuitively (and morally), I think the answer is undeniably:
Yes. Empirically, it may prove to be a different story, because there may
turn out to be some causal dependencies: Certain microfunctional (T4) or
even microphysical (T5) properties may not prove to be viable in passing
T3 (e.g., it might not be possible to pass T3 with serial processors, or
with components made out of heavy water); other T4 or T5 properties might
give us positive clues about how to pass T3 (excitation/inhibition or 5-hydroxytryptamine
may be essential) -- and any success-enhancing clue is always welcome in
reverse-engineering, as anywhere else! In this way neuroscience and even
fundamental physics could conceivably help us pass T3; but T3 would still
be the final arbiter, not T4 or T5.

The reason is clear, and again it is already implicit in Turing's functional-indistinguishability
criterion: Not only are ordinary people not mind-readers; reverse-engineers
aren't mind-readers either. All are constrained by the other-minds barrier.
Function is the only empirical criterion.

(And structural correlates are only reliable if they are first cross-validated
by function: the reason I trust that one flashing red nucleus in my otherwise
comatose and ostensibly brain-dead patient is evidence that he is still
alive, feeling, and conscious, and will shortly make a come-back, is that
prior patients have indeed made come-backs under those conditions, and
subsequently reported the words they had heard me speak as I contemplated
pulling the plug on their life-support systems at the very moment that
only their red nucleus was still flashing.)

23. 1st-Person and 3rd-Person Facts

Now the forward-engineering of the mind would clearly be only a functional
matter, just as the forward engineering of bridges, cars and space stations
is. Reverse-engineering the mind is a bit more complicated, because we
know one extra fact about the mind, over and above the 3rd-person functional
facts, and that is the 1st-person fact that it feels like something to
have/be a mind.

But this 1st-person fact is precisely the one to which the 3rd-person
task of reverse engineering can never have any direct empirical access;
only its functional correlates are available
(Harnad
2000c).
Yet it is indeed a fact, as is clearly illustrated if we contrast the
case of quarks with the case of qualia ("qualia" is the name
philosophers have given to the qualitative contents of our mental
experiences: they are what it feels like to have a mind, experience
experiences, think thoughts, mean meanings; Nagel 1974; Harnad 1993c):

24. Quarks, Qualia, and Causality

What we are actually asking about each of our nine candidates is whether
or not it really has qualia. Is this like asking, about the Utopian world-models,
whether there really are quarks in the world (i.e., whether the quarkful
GUTE is the right GUTE)? Yes, but it's a bit worse in the case of mind-models,
because whereas the reality or nonreality of (unobservable-in-principle)
quarks, apart from the formal predictive/explanatory role they play (or
don't play) in their respective GUTEs, makes and can make no palpable difference
to anyone (I chose the word "palpable" advisedly), this is decidedly untrue
of qualia. For surely it would make a world of palpable difference to
one of the nine candidates,
if we got our inferences wrong, and assumed it
did not have qualia, when it in reality did! And each of us knows, palpably,
precisely what that difference is (unlike the nebulous difference that
the existence or non-existence of quarks would make, given the total predictive/explanatory
power of its GUTE).

So, metaphysically (as opposed to empirically or intuitively), more
is at stake in principle with qualia than with quarks -- which is why mind-modelling
has a Zombie Problem whereas world-modelling does not (Harnad 1995b).

25. Mind-Blindness, the Blind Watchmaker, and Mind over Matter

And there is a further difference between the case of quarks and the
case of qualia: quarks, whether or not they exist, do play both a formal
and a causal role in their GUTE. Qualia can play neither a formal nor a
causal role in any mind-model, be it T3, T4 or T5. They are undeniably
there in the world, but they can play no explicit role in the reverse-engineering,
on pain of telekinetic dualism or worse
(Harnad 1982b;
Alcock 1987)! This too is implicit in Turing's insistence on functional
criteria alone.

To put in context the mind-blind functionalism that Turing indistinguishability
imposes on us, it is instructive to note that the Blind Watchmaker who
forward-engineered us all (Darwinian Evolution) was a strict functionalist
too, and likewise no mind-reader. Natural Selection is as incapable of
distinguishing the Turing-indistinguishable as any of the rest of us (Harnad
2000a).

26. Turing's Text and Turing's Mind

This completes our exegesis of Turing (1950).
Was this really the author's intended meaning? Perhaps not, but if not,
then perhaps it ought to have been! His text is certainly systematically
interpretable as meaning and implying all of this. But if Turing did not
actually have it all in mind, then we are making a wrong pen-pal inference
in imputing it to him -- and it should instead be attributed only to the
inert code he generated in the form of his celebrated paper.

FOOTNOTES

Footnote 1. One of the referees for
this article suggested that I respond to the following three published
criticisms by Hauser (1993),
so I hereby do so:

(i) It is incoherent to hold that (a) robotic (T3) capability is causally
necessary for linguistic (T2) capability but that (b) T2 is insufficient
to evidence intelligence and T3 is.

Nothing is incoherent here at all. If my symbol-grounding arguments and
evidence that linguistic capacity must be grounded in robotic capacity
are correct, then it follows that only a candidate that has the capacity
to pass T3 will have the capacity to pass T2, even though T2 does not test
T3 (robotic) capacity directly. This hypothesis is not incoherent (but,
like any empirical hypothesis, it is not necessarily correct either). Let
us call it hypothesis (a).

So far we have mentioned only sufficiency to pass T2. Sufficiency "to
evidence intelligence" is the further question of whether the T2-passing
candidate would indeed have a mind. Let us call this Turing's conjecture
(b).

I subscribe to Turing's conjecture (b) too, with one special exception,
namely, should my grounding hypothesis (a) be incorrect, and should it
turn out that an ungrounded, implementation-independent symbol system alone
could pass T2 after all, then it would not have a mind (as shown by Searle's
Periscope).

This invalidates T2 as evidence of having a mind in the case of one
very special kind of candidate only, a purely computational one. All the
possibilities and contingencies, however, are perfectly coherent.

(ii) It is also incoherent to hold that (a) no public test is any
evidence of consciousness but (b) T3 provides better evidence than T2.

First, from the grounding hypothesis (a), it follows that only a candidate
with T3 capacity will be able to pass T2. But there is still the question
of how to reverse-engineer T2 capacity. If T2 capacity is indeed grounded
in T3 capacity (even though T2 does not test it directly), it follows that
to successfully reverse-engineer T2 capacity we must reverse-engineer T3
capacity.

Turing's conjecture (b) is that we do not have a better basis for inferring
that a candidate has a mind (consciousness) than T2/T3 capacity. T3 is
a "better" basis for reverse-engineering that capacity than T2 alone (for
T2 is a subset of T3, and grounded in it); it is also a better basis for
testing success in reverse-engineering it (it would be easier to catch
out a robot than a pen-pal, both because a robot's capacities subsume a
pen-pal's capacities, and because they give us more to go on in "mind-reading"
the candidate for anything that distinguishes it from a real person).

But a positive outcome to T3 still leaves open the distinct possibility
that the candidate is mindless (i.e., that Turing's conjecture (b) is false)
-- a possibility that is trivial when it comes to Turing-Testing members
of our own species (unless they are comatose), but one that becomes more
substantial as we move toward species less and less like ourselves; the
only way to know whether a polyp, or a plant, feels anything when you pinch
it is to be that polyp or plant. The same is true of a toy robot;
but as its capacities become more and more like our own, can we help ourselves
to the same easy confidence we have with natural members of our own species
(Harnad 1985)?

All of this is perfectly coherent. We don't need 1st-hand evidence of
consciousness in other members of our species because we are designed with
mind-reading capacities that leave no room for doubt. But when we deploy
those same mind-reading capacities on species or artifacts that they were
not designed for (by the Blind Watchmaker), they could lead us astray.
Turing is right to remind us, though, that only a fool argues with the
indistinguishable.

Otherwise put: Because of the other-minds problem, we will never have
access to positive evidence about consciousness in any case but our own.
But, on Turing's conjecture, in other cases, absence of Turing Indistinguishability
could be taken as negative evidence.

(iii) T3 is no more immune to Searle's putative counterexamples than
T2, so Searle's putative counterexample cannot motivate ascent from T2
to T3.

The only kind of candidate for having a mind that is vulnerable to Searle's
periscope is an implementation-independent symbol system. By definition,
such a system could (at most) only pass T2, not T3 (which requires a body,
which is not implementation-independent); so T3 is immune to Searle's Periscope
whereas T2 is not (for that special kind of candidate).

Footnote 2. Having appealed to our
intuitions in this particular hypothetical case, it is unavoidable to go
on to raise the question of whether we would feel comfortable (on the strength
of Searle's Chinese Room Argument) in pulling the plug on our lifelong
T2 pen-pal if we found out that he was just a computer. Here our evolved
"theory-of-mind" intuitions -- our Turing intuitions -- are utterly at
odds with both our rational analysis and our real-life experience to date.
Some people are even loath to pull the plug on cyberplants.

So of the two aspects of the Turing Test -- the empirical one, of successfully
reverse-engineering our full performance capacity, and the intuitive one,
of being unable to discern any difference between the candidate and any
other person with a mind -- the intuitive, "mind-reading" aspect would
be tugging hard in the case of a T2-passing computer, despite Searle's
periscope, which unmasks the symbol system, but not the computer. For in
the case of the computer pen-pal, we could rationalise the outcome by saying
"So it's that computer implementation that has a mind, not just
the symbol system it is implementing" (thereby merely jettisoning implementation-independence,
and Searle's periscope with it).

So what if the code was being ported to a different computer every day?
Our knowledge about transplants and cloning, plus our telefantasies about
teleportation would still restrain us from pulling the plug (reinforced
by the asymmetry between the relative consequences of false positives vs.
false negatives in such capital decisions).

And what if it turned out to have been Searle mindlessly executing the
code all those years, and telling us now that he had never understood a
word? Well then the question would be moot, because there would be nothing
left to pull the plug on anyway (although most of us would feel that we'd
still like to send Searle back to the Room so we could write one last letter
to our pen-pal under even those conditions, to find out what he
had to say about the matter -- or at least to say goodbye!)

We are contemplating hypothetical intuition clashes here; so perhaps
we should wait to see what candidates can actually pass the TT before needlessly
committing ourselves to any counterfactual contingency plans.

REFERENCES

Alcock, J.E. (1987) Parapsychology: Science of the anomalous or search
for the soul? Behavioral and Brain Sciences 10 (4): 553-565.