Early AI Programs

The first working AI programs were written in the UK by Christopher Strachey, Dietrich Prinz, and Anthony Oettinger. Strachey was at the time a teacher at Harrow School and an amateur programmer; he later became Director of the Programming Research Group at Oxford University. Prinz worked for the engineering firm of Ferranti Ltd, which built the Ferranti Mark I computer in collaboration with Manchester University. Oettinger worked at the Mathematical Laboratory at Cambridge University, home of the EDSAC computer.

Strachey chose the board game of checkers (or draughts) as the domain for his experiment in machine intelligence. Strachey initially coded his checkers program in May 1951 for the pilot model of Turing's Automatic Computing Engine at the National Physical Laboratory. This version of the program did not run successfully; Strachey’s efforts were defeated first by coding errors and subsequently by a hardware change that rendered his program obsolete. In addition, Strachey was dissatisfied with the method employed in the program for evaluating board positions. He wrote an improved version for the Ferranti Mark I at Manchester (with Turing's encouragement and utilising the latter's recently completed Programmers' Handbook for the Ferranti computer). By the summer of 1952 this program could, Strachey reported, "play a complete game of Draughts at a reasonable speed".

Prinz's chess program, also written for the Ferranti Mark I, first ran in November 1951. It was for solving simple problems of the mate-in-two variety. The program would examine every possible move until a solution was found. On average several thousand moves had to be examined in the course of solving a problem, and the program was considerably slower than a human player.

Turing started to program his Turochamp chess-player on the Ferranti Mark I but never completed the task. Unlike Prinz's program, the Turochamp could play a complete game and operated not by exhaustive search but under the guidance of rule-of-thumb principles devised by Turing.

Machine learning

Oettinger was considerably influenced by Turing's views on machine learning. His "Shopper" was the earliest program to incorporate learning (details of the program were published in 1952). The program ran on the EDSAC. Shopper's simulated world was a mall of eight shops. When sent out to purchase an item Shopper would if necessary search for it, visiting shops at random until the item was found. While searching, Shopper would memorise a few of the items stocked in each shop visited (just as a human shopper would). Next time Shopper was sent out for the same item, or for some other item that it had already located, it would go to the right shop straight away. As previously mentioned, this simple form of learning is called "rote learning" and is to be contrasted with learning involving "generalisation", which is exhibited by the program described next. Learning involving generalisation leaves the learner able to perform better in situations not previously encountered. (Strachey also investigated aspects of machine learning, taking the game of NIM as his focus, and in 1951 he reported a simple rote-learning scheme in a letter to Turing.)

The first AI program to run in the U.S. was also a checkers program, written in 1952 by Arthur Samuel of IBM for the IBM 701. Samuel took over the essentials of Strachey's program (which Strachey had publicised at a computing conference in Canada in 1952) and over a period of years considerably extended it. In 1955 he added features that enabled the program to learn from experience, and therefore improve its play. Samuel included mechanisms for both rote learning and generalisation. The program soon learned enough to outplay its creator. Successive enhancements that Samuel made to the learning apparatus eventually led to the program winning a game against a former Connecticut checkers champion in 1962 (who immediately turned the tables and beat the program in six games straight).

To speed up learning, Samuel would set up two copies of the program, Alpha and Beta, on the same computer, and leave them to play game after game with each other. The program used heuristics to rank moves and board positions ("looking ahead" as many as ten turns of play). The learning procedure consisted in the computer making small numerical changes to Alpha's ranking procedure, leaving Beta's unchanged, and then comparing Alpha's and Beta's performance over a few games. If Alpha played worse than Beta, these changes to the ranking procedure were discarded, but if Alpha played better than Beta then Beta's ranking procedure was replaced with Alpha's. As in biological evolution, the fitter survived, and over many such cycles of mutation and selection the program's skill would increase. (However, the quality of learning displayed by even a simple living being far surpasses that of Samuel's and Oettinger's programs.)

Evolutionary computing

The work by Samuel just described was among the earliest in a field now called evolutionary computing and is an example of the use of a geneticalgorithm or GA. The term "genetic algorithm" was introduced in about 1975 by John Holland and his research group at the University of Michigan, Ann Arbor. Holland's work is principally responsible for the current intense interest in GAs. GAs employ methods analogous to the processes of natural evolution in order to produce successive generations of software entities that are increasingly fit for their intended purpose. The concept in fact goes back to Turing's manifesto of 1948, where he employed the term "genetical search". The use of GAs is burgeoning in AI and elsewhere. In one recent application, a GA-based system and a witness to a crime cooperate to generate on-screen faces that become closer and closer to the recollected face of the criminal.

Reasoning and problem-solving

The ability to reason logically is an important aspect of intelligence and has always been a major focus of AI research. In his 1948 manifesto, Turing emphasised that once a computer can prove logical theorems it will be able to search intelligently for solutions to problems. (An example of a simple logical theorem is "given that either X is true or Y is true, and given that X is in fact false, it follows that Y is true".) Prinz used the Ferranti Mark I, the first commercially available computer, to solve logical problems, and in 1949 and 1951 Ferranti built two small experimental special-purpose computers for theorem-proving and other logical work.

An important landmark in this area was a theorem-proving program written in 1955-1956 by Allen Newell and J. Clifford Shaw of the RAND Corporation at Santa Monica and Herbert Simon of the Carnegie Institute of Technology (now Carnegie-Mellon University). The program was designed to prove theorems from the famous logical work Principia Mathematica by Alfred North Whitehead and Bertrand Russell. In the case of one theorem, the proof devised by the program was more elegant than the proof given by Whitehead and Russell.

The Logic Theorist, as the program became known, was the major exhibit at a conference organised in 1956 at Dartmouth College, New Hampshire, by John McCarthy, who subsequently became one of the most influential figures in AI. The title of the conference was "The Dartmouth Summer Research Project on Artificial Intelligence". This was the first use of the term "Artificial Intelligence". Turing's original term "machine intelligence" has also persisted, especially in Britain.

Newell, Simon and Shaw went on to construct the General Problem Solver, or GPS. The first version of GPS ran in 1957 and work continued on the project for about a decade. GPS could solve an impressive variety of puzzles, for example the "missionaries and cannibals" problem: How are a party of three missionaries and three cannibals to cross a river in a small boat that will take no more than two at a time, without the missionaries on either bank becoming outnumbered by cannibals? GPS would search for a solution in a trial-and-error fashion, under the guidance of heuristics supplied by the programmers. One criticism of GPS, and other programs that lack learning, is that the program's "intelligence" is entirely second-hand, coming from the programmer (mainly via the heuristics, in the case of GPS).

Natural language communication

Two of the best-known early programs are Eliza and Parry. Details of both were first published in 1966. These programs gave an eerie semblance of conversing intelligently. Parry, written by Stanford University psychiatrist Kenneth Colby, simulated a human paranoiac. Parry's responses are capitalised in the following extract from a "conversation" between Parry and a psychiatric interviewer.

Why are you in the hospital?
I SHOULDN'T BE HERE.
Who brought you here?
THE POLICE.
What trouble did you have with the police?
COPS DON'T DO THEIR JOB.
What do they do instead?
COPS ARREST THE WRONG PEOPLE.
They arrest the right ones sometimes.
THE LAW SHOULD GET THOSE ITALIAN CROOKS.
Are you Italian?
MY BACKGROUND IS BRITISH BUT I WAS BORN IN THIS COUNTRY.
Do you know any Italian crooks?
I TRY TO AVOID THE UNDERWORLD.

Psychiatrists who were asked to decide whether they were communicating with Parry or a human paranoiac were often unable to tell.

Eliza, written by Joseph Weizenbaum at MIT, simulated a human therapist. In the following extract, Eliza "speaks" second.

Men are all alike.
IN WHAT WAY
They're always bugging us about something or other.
CAN YOU THINK OF A SPECIFIC EXAMPLE
Well, my boyfriend made me come here.
YOUR BOYFRIEND MADE YOU COME HERE
He says I'm depressed much of the time.
I AM SORRY TO HEAR THAT YOU ARE DEPRESSED
It's true. I am unhappy.
DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE UNHAPPY
I need some help, that much seems certain.
WHAT WOULD IT MEAN TO YOU IF YOU GOT SOME HELP

Neither Parry nor Eliza can reasonably be described as intelligent. Parry's contributions to the conversation are "canned"--constructed in advance by the programmer and stored away in the computer's memory. As the philosopher Ned Block says, systems like Parry are no more intelligent than is a juke box. Eliza, too, relies on canned sentences and simple programming tricks (such as editing and returning the remark that the human participant has just made).