The Evolutionary Emergence Route to Artificial Intelligence

Alastair Channon

Degree: MSc in Knowledge-Based Systems 1995/96
School of Cognitive and Computing Sciences, University of Sussex
Supervisor: Inman Harvey
Submitted: 2 September 1996 (minor revisions October 1996)

Abstract

The artificial evolution of intelligence is discussed with respect to current methods. An argument for withdrawal of the traditional ‘fitness function’ in genetic algorithms is given on the grounds that this would better enable the emergence of intelligence, necessary because we cannot specify what intelligence is. A modular developmental system is constructed to aid the evolution of neural structures, and a simple virtual world with many of the properties believed beneficial is set up to test these ideas. Resulting emergent properties are given, along with a brief discussion.

Keywords: Artificial Intelligence, Emergence, Genetic Algorithms, Artificial Life, Neural Networks, Development, Modularity, Fractals, Lindenmayer Systems, Recurrence.

Acknowledgments

Thanks to my supervisor Inman Harvey for his encouragement in the area and comments on this project.

Contents

1 Introduction
  1.1 Aims
  1.2 Rationale and Outline of the Dissertation

I Genetic Algorithms and Artificial Neural Networks
2 Genetic Algorithms
  2.1 Conventional GAs
  2.2 Variable length genotypes (required for emergence)
  2.3 Genetic Programming
3 Artificial Neural Networks
  3.1 Conventional ANNs
  3.2 Recurrent ANNs
  3.3 The ANNs used in this project

II Evolutionary Emergence
4 Emergence
5 Evolutionary Emergence
  5.1 An Example: ‘Farmers and Nomads’

III Development, Modularity and Fractals in ANNs
6 Why this is relevant
7 An overview of modular developmental strategies for ANNs
  7.1 Frédéric Gruau’s Cellular Encoding
  7.2 Cellular Automata
  7.3 Lindenmayer Systems
8 Lindenmayer Systems
  8.1 Kitano’s Graph-Generating Grammar for evolving ANN connectivity matrices
  8.2 Boers’ and Kuiper’s L-systems for evolving ANNs

IV Central Ideas of the Project
9 Artificial Intelligence calls for Evolutionary Emergence
10 Evolutionary Emergence calls for the dismissal of explicit fitness functions
11 Evolutionary Emergence of AI calls for Developmental Modularity
12 Developmental Modularity calls for L-systems

V The Lindenmayer Systems Used
13 Details of the Lindenmayer systems used
14 Genetic encoding of the production rules
15 Initial experiment to verify the system so far

VI The Main Experiment
16 Details of the individuals being evolved
17 Outline of the genetic algorithm used
18 Implementation
19 Results

VII Conclusions and Suggestions for Further Work
20 Conclusions
21 Suggestions for further work

Bibliography

Appendix 1: Sample Output from the Main Experiment
Appendix 2: Code of the Main Experiment
Appendix 3: Sample Output from the Initial Experiment of chapter 15
Appendix 4: Code of the Initial Experiment of chapter 15

Figures

Figure 1: The difference between GAs for function optimisation and natural evolution
Figure 2: Schematic block diagram of the neurons used in this project: from [Cliff et al. 1993]
Figure 3: A context-sensitive L-system production rule template
Figure 4: An example L-system development: from [Boers et al. 1993]
Figure 5: Template of the Lindenmayer system production rules used in this project
Figure 6: Example of the genome decoding method used in this project

1 Introduction

1.1 Aims

This project is directed towards my long-term goal of developing artificial intelligences (AIs) that are capable of more than just a small set of tasks and can grasp profoundly new situations on their own. My interest is in intelligence such as ours and ants’, rather than dedicated chess computers or current expert systems.

The project was aimed at the heart of the problem: forming a general system that would adapt to behave in an intelligent way within an environment, without being given any information about how to behave.
The behaviour would have to emerge from the combination of the system and its environment.

I planned to achieve this by adapting genetic algorithms (GAs) from their conventional form towards natural evolution. I aimed to achieve emergence as in natural evolution, via the coexistence of similarly-capable systems, rather than GAs’ usual operation of function optimisation.

[Figure 1 shows two sketched plots against time: for GAs for function optimisation, fitness over time (not good for emergence); for natural evolution, proportion of population over time (good for emergence: we do not have to specify what is good). Example: if one of two species which share resources improves sufficiently, the other either improves or becomes extinct.]

Figure 1: The difference between GAs for function optimisation and natural evolution

In summary, the end results aimed for were demonstrable emergent properties, arising from the interactions between the items being evolved (within their environment), including some that could be considered intelligent.

1.2 Rationale and Outline of the Dissertation

A genetic algorithm has to have objects to evolve. The choice of a suitable class of objects was made along the same lines as the choice of method (a GA): base it on something that has already worked. Natural evolution has evolved biological systems that run on massively parallel, low-speed computation with a low number of processing steps. Artificial neural networks (ANNs) are an attempt to produce systems that work in a similar way to biological systems, and so are well suited to this project.

Part I gives the primary background to this project: genetic algorithms (chapter 2) and artificial neural networks (chapter 3), including the type of ANNs used in this project. Parts II and III build on this, providing further background relating to GAs and ANNs respectively.

Part II discusses emergence, especially evolutionary emergence. An example of evolutionary emergence in a GA is given.

Part III discusses development and modularity in ANNs, and explains why these are important to this work. This can be seen as a third example of taking note of what has worked in natural evolution, although it is not the only reasoning given. Consideration and coding of developmental modularity constituted approximately half of the work undertaken on this project.

By the end of part III, the reader may well have settled on the central ideas of the project as given in part IV.
The words ‘calls for’ were chosen carefully in the chapter headings of part IV; they do not mean ‘needs’ but rather ‘would be very well served by’ or ‘should urge us to use’.

One topic that will not be discussed there is ‘situation within a world’. Rod Brooks [1991a,b], one of the main figures in non-traditional AI, puts forward a strong argument that incremental development (including evolution) must take place within the world that the objects are to operate in. This is to avoid the problem traditional AI often has of there being a gap between the objects and the world. So, for example, some researchers argue that robots that are to operate in the real world must be evolved (or at least evaluated) in the real world. However, in this project the ANNs are only ever to operate in a virtual world, so there should be no concern about them being evolved in that virtual world. It is not a simulation, and so suffers none of the problems that occur when trying to use a simulation to evolve robots for the real world. Where the virtual world differs from our world (however greatly), there is no unfortunate error. There is simply a difference.

The post-contemplation work begins in part V, with details of the developmental modularity system designed for this project. Part VI describes the main experiment (a non-conventional GA) itself, which uses the developmental modularity system of part V. This was designed to allow the emergence of properties that could be considered intelligent, as was the aim of the project. Conclusions are given in the final part (part VII).

I GENETIC ALGORITHMS AND ARTIFICIAL NEURAL NETWORKS

2 Genetic Algorithms

2.1 Conventional GAs

Conventional genetic algorithms (GAs) are search algorithms inspired by natural evolution. They perform (fairly) well at a wide range of difficult optimisation problems, which is currently what most GAs are used for.
John Holland [1992,1993] developed the basis of the genetic algorithm (suited to evolution by both mating and mutation) in the early 1960s.

GAs evolve an initially random population of solutions (to whatever the problem is) by picking which individuals (solutions) will live on and/or mate into the next generation. To do this they evaluate the individuals’ ‘fitnesses’ via some procedure relevant to the problem. The fitter individuals reproduce, replacing less-fit individuals, which perish. Then the cycle starts over again, beginning with the resulting population from the last generation. In most GAs, the population size remains constant. Two operators are very commonly used at reproduction: crossover and mutation.

Evolution is greatly speeded up when mating is used to combine fit solutions, rather than just reproducing an individual from another one. Simple (single-point) crossover, whereby an individual inherits its code (chromosome) from one parent up to a random cut point and from the other parent after the cut point, achieves this. It is biologically faithful, as animal chromosomes cross over in this way when two gametes (sperm and egg in humans) meet to form a zygote (which forms the embryo).

During reproduction a gene may randomly change, with low probability. This is called mutation and is also a biologically faithful idea. In observed biological systems, many mutations are neutral; that is, they have no (or negligible) effect on the phenotype. This is not the case in most GAs, as most GAs are used for some sort of function optimisation.

Crossover and mutation are the genetic operators used in this project. Other biologically faithful genetic operators exist. For example, inversion increases linkage (reduces the likelihood that crossover will separate genes that need to occur together) and adds non-destructive noise that helps crossover to escape local optima. Another example is transposition, which increases duplication.
Neither inversion nor transposition is used in this project.

There are many minor variations on the conventional GA. Many different selection methods are used, such as rank-based (the top n% survive to the next generation) and fitness-proportional (the number of offspring is proportional to fitness). The population may be distributed, individuals reproducing with nearby individuals and their children being born nearby. Chapter 15 describes a (basically) conventional distributed GA using a non-generational selection method, which I wrote to test the developmental modularity system before writing the main experiment. Should more information on conventional GAs be required, Goldberg [1989] is a good introductory text.

2.2 Variable length genotypes (required for emergence)

Most conventional GAs set out to solve a well-defined optimisation problem. For such problems the individuals (solutions) are generally encoded on a fixed-length genotype, often directly (using a bijection between the solution set and the chromosome set). But we wish to go beyond using GAs to solve a single given problem, towards using them for systems that do well in general, so giving rise to intelligent behaviour. GAs with fixed-length genotypes cannot be used to evolve increasingly impressive systems as natural evolution has done, so we need to use variable-length genotypes.

The use of variable-length genotypes in GAs is not a new idea. The subject of how the lengths should change has been addressed by Inman Harvey’s Species Adaptation Genetic Algorithm (SAGA) theory [1992a,b,c]. He argues (and demonstrates) that changes in genotype length should take place much more slowly than crossover’s fast mixing of chromosomes. The population should be nearly converged, evolving as species; mutation rates should be low enough that they do not disperse the species (in genotype space) or hinder the assimilation, by crossover, of good mutations into the species. The idea of species is not engineered in, but rather a result of this theory.
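To make the variable-length case concrete, crossover over variable-length genotypes can be sketched as below. This is a hypothetical illustration, not SAGA’s (or this project’s) actual operator: choosing independent cut points in the two parents is one simple way of allowing offspring length to drift gradually.

```python
import random

def crossover(parent_a, parent_b, rng=random):
    """Single-point crossover for variable-length genotypes.

    A cut point is chosen independently in each parent, so the
    child's length can differ from both parents' lengths.
    """
    cut_a = rng.randrange(len(parent_a) + 1)
    cut_b = rng.randrange(len(parent_b) + 1)
    return parent_a[:cut_a] + parent_b[cut_b:]

def mutate(genotype, rate=0.01, alphabet="01", rng=random):
    """Replace each gene by a random symbol with low probability."""
    return "".join(rng.choice(alphabet) if rng.random() < rate else g
                   for g in genotype)

child = crossover("00000000", "11111111")
```

Under this scheme the offspring length is the sum of the two inherited segment lengths, so lengths change only through the (random) offset between cut points rather than by any explicit growth operator.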
A new species comes about when a progenitorial species splits into separate ones. A species becomes extinct when all its members die.

In slightly more detail, taking into account gene duplication and genotype-to-phenotype systems (ontogenesis), Harvey argues that the complexity of the information contained within the genomes of a species should change slowly, relative to the assimilation of advantageous mutations.

2.3 Genetic Programming

Genetic Programming (GP) is the application of GAs to evolving programs. The best-known advocate of GP is Koza [1990,1992]. Individuals are commonly LISP S-expressions, and so the genotypes have a tree structure rather than the more usual linear string. Crossover swaps complete sub-trees, so always producing syntactically correct children. If the sub-trees swapped are of different sizes, then the child will (probably) not have the same genotype size as either parent.

Rodney Brooks proposed, in [Brooks 1992], an extension of Koza’s ideas with the aim of evolving robots. However, his Behavioural Language (BL) in effect forms blueprints rather than ‘recipes’ for the robots’ networks. This is highly objectionable on grounds that will be given in part III.

Frédéric Gruau [1996] uses GP to evolve his ‘cellular programming language’ code to develop artificial neural networks. See section 7.1 for a discussion of this.

3 Artificial Neural Networks

Artificial Neural Networks (ANNs) are an attempt to produce systems that work in a similar way to biological nervous systems. Biological nervous systems are built from neurons which are very much slower than electronic components; the power of animals’ brains comes from massive parallel processing. The standard introduction to ANNs is provided by Rumelhart, McClelland and the Parallel Distributed Processing Research Group at the University of California [Rumelhart et al. 1992].

3.1 Conventional ANNs

Conventional ANNs contain a number of simple processing units called nodes.
Each node has an output, which may be propagated to other nodes via weighted links, and a number of inputs from other nodes. The inputs to a node are each multiplied by the weight of the relevant link and then summed to produce the total input to the node. The node’s output is then a function of this total input (including any external inputs), any memory term (such as a proportion of the previous output) and any node constants, such as a threshold. For example, the following sigmoid output function is often used:

    node output = 1 / (1 + e^-(total node input - node threshold))

The structure of most ANNs has traditionally been layered, with all of a node’s links being to nodes in higher layers; there are no cycles in the network. Without loss of generality, we can talk of an ‘input layer’ and an ‘output layer’, with all intermediate layers being ‘hidden layers’. Learning in these ANNs is most commonly via supervised ‘backpropagation’. To train the network, example input-output pairs are presented in turn. For each pair, the network’s outputs are calculated by forward-propagating the inputs (calculating nodes’ outputs) through the hidden layers to the output layer. Gradient descent is used on the length of the error vector (actual outputs minus example outputs) to adjust the weights just before the output layer. The errors are then back-propagated to the layer below, by summing the weighted errors from the nodes output to, and gradient descent is applied to adjust the next layer of weights. This continues until the weights just above the input nodes have been adjusted; then the process is repeated with the next training pair.

Other sorts of ‘neural networks’ are also common. For example, radial basis function (RBF) networks are similar to the above but (commonly) use a Gaussian output function, so each node has two variables (field centre and field width) to which gradient descent is applied, as well as the link weights. Self-organising maps such as Kohonen networks are often bundled under the same heading as conventional neural networks. Most of these other network classes are, however, further removed from the biological nervous system background of ANNs, and are less suitable models for work such as this project.

3.2 Recurrent ANNs

Removing the feed-forward (layered) constraint of conventional ANNs results in recurrent networks: networks with link-cycles in them. Recurrent networks can have internal state sustained over time and demonstrate rich intrinsic dynamics.
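As a minimal illustration of such dynamics, one synchronous update step of a recurrent network can be sketched as below. This is a generic sketch using the sigmoid output function of section 3.1, not the excitatory/inhibitory node model of [Cliff et al. 1993]; the weight matrix and values are hypothetical.

```python
import math

def sigmoid(x, threshold=0.0):
    """Sigmoid output function: 1 / (1 + e^-(x - threshold))."""
    return 1.0 / (1.0 + math.exp(-(x - threshold)))

def step(weights, outputs, external_inputs):
    """One synchronous update of a recurrent network.

    weights[i][j] is the weight of the link from node j to node i.
    Cycles (including self-links) are allowed, which is what gives
    the network internal state that persists across time steps.
    """
    n = len(outputs)
    return [sigmoid(sum(weights[i][j] * outputs[j] for j in range(n))
                    + external_inputs[i])
            for i in range(n)]

# A two-node network with a feedback cycle between the nodes.
w = [[0.0, 1.0],
     [1.0, 0.0]]
state = [0.0, 0.0]
for _ in range(3):
    state = step(w, state, [0.5, 0.0])
```

Note that even with constant external input the outputs keep changing for several steps as activity circulates around the cycle; a feed-forward network settles in a single pass.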
This makes them attractive for use in adaptive behaviour work. Evidence from neuroscience is also on their side, showing that most biological neural networks are recurrent.

Whilst recurrent ANNs can be very hard to study and construct manually, artificial evolution should not have any problem using them. Indeed, there seems to be little reason to constrain the evolution to feed-forward networks, especially when aiming for autonomous agents that are to act as complex dynamical systems working within a time frame.

3.3 The ANNs used in this project

The ANNs used in this project are recurrent networks of nodes as used successfully (hence their choice here) by Inman Harvey, Dave Cliff and Phil Husbands in their evolutionary robotics work [Cliff et al. 1992,1993,1996; Harvey et al. 1992]. They evolved recurrent networks of these nodes for visual navigation tasks in simple environments.

[Figure 2, a schematic block diagram from [Cliff et al. 1993], shows a neuron with separate excitatory and inhibitory input sums, a uniform noise source (PDF over -0.1 to +0.1), separate delays on the excitatory and inhibitory channels, and a thresholded output.]

Figure 2: Schematic block diagram of the neurons used in this project: from [Cliff et al. 1993]

For details, please see [Cliff et al. 1993]. All links have weight 1 for this project; no lifetime learning was used. This was to avoid the criticism, which can be levelled at related work (see section 8.2), that learning rather than evolution was the main factor. However, the importance of lifetime learning is recognised in chapter 21 (Suggestions for further work).

II EVOLUTIONARY EMERGENCE

4 Emergence

Emergence is relevant to artificial intelligence (AI) because it has become apparent, through AI work to date, that we do not understand enough about intelligence to be able to program it into a machine. Therefore AI must aim either to increase our understanding of intelligence to a level such that we can program it into a device, or to build a device which outperforms the specifications that we give it. The first approach is, presumably, the one being taken by researchers in the field of traditional AI.
The second is a newer approach, and the approach taken in this project.

Emergence relates to unexpected, unprogrammed behaviours. However, this is not the best way to define emergence, because it depends on the predictive ability of the observer and demands a once-only instance of emergence. Steels [1994] gives temperature and pressure as examples of emergent properties that would not be classified as emergent by such a definition. He uses emergence to refer to ongoing processes which produce results requiring vocabulary not previously involved in the description of the system’s inner components. This is the meaning used throughout this project.

5 Evolutionary Emergence

Conventional genetic algorithms (GAs) use problem-specific evaluation functions. Natural evolution has no (explicit) evaluation functions. Individuals simply live until they die, reproducing, or not, during this time. As a result of organism-environment interactions, certain behaviours fare better than others. This is how individuals are ‘evaluated’ and how non-random cumulative selection works without any long-term goal. It is also why new abilities can emerge.

Aiming at such emergence in GAs seems to be the most promising route to building a device which outperforms the specifications that we give it (see above). I therefore see evolutionary emergence as the foremost hope for the development of artificial intelligences of the human/ant brain variety (as opposed to the chess computer variety).

5.1 An Example: ‘Farmers and Nomads’

Sannier and Goodman [1987] used a GA to evolve genomes within an artificial world. Each individual (genome) was given the ability to detect the conditions in its internal and immediate external environments. The population existed within a two-dimensional toroidal world containing ‘food’. An individual’s ‘strength’, which is taken (deducted) from its parents’ at birth, increases on consumption of food and decreases at each time step (and upon reproduction).
To reproduce, an individual’s strength must be above a threshold; an individual dies if its strength drops below a lower threshold.

The genomes encode rules which allow individuals to move in one of eight directions (N, NE, E, SE, ...), with conditional program branching based on whether or not there is food in any of the eight neighbouring locations.

In the experiment reported, food was restricted to two ‘farm’ areas, spaced apart in the toroidal world. The level of food in a farm varied periodically; when one farm was having its ‘summer’, the other would be having its ‘winter’. Also, a farm’s potential was lower the more it had been either over-consumed or neglected (under-consumed) during the previous period.

Out of the evolution emerged two classes of individual: ‘farmers’, who stayed put in one of the farms, their farm populations rising and falling with the ‘seasons’, and ‘nomads’, who circled the world, moving in such a way that they passed through both farms during their summers. The nomad population would increase as it went through a farm and decrease as it moved through the area without food. Groups of individuals of each class were extracted from the total population and tested in the absence of the other class. Whilst farmers could survive without nomads, it was found that nomads needed the farmers so that the farms would not be neglected between visits.

The important (and relevant) thing in this work is the emergence of the two classes of individual. Never was it specified that they should come about. The evolution resulted in them simply because they do better than other solutions within the environment. The only information given, beyond the genetic algorithm and external environment, was the possible actions (moves) and the conditional (food in neighbourhood?). It could be said that the system outperformed the specifications that were given to it.

III DEVELOPMENT, MODULARITY AND FRACTALS IN ANNS

6 Why this is relevant

The brain is modular at several levels.
For example, the cerebral cortex contains macro-modules of hundreds of minicolumns, each minicolumn being a module containing approximately one hundred neurons. There are known examples of neural structures that serve a purpose different from their original use, for example [Stork et al. 1991]. When one considers animals as a whole, it is clear that most properties evolved from ancestral properties which had slightly different functions. For example, if we could trace far enough back up our evolutionary tree, we would not expect ears to have suddenly appeared on an individual which was in a situation where ears would be useful. Similarly, then, we can expect all (or at least most) neural structures to be descended from neural structures which once had a different use.

Evidence from gene theory tells us that genes are like a recipe, not a blueprint. In any one cell, at any one stage of development, only a tiny proportion of the genes will be in use. Further, the effect that a gene then has depends upon what the cell can affect - that is, what the cell’s neighbours are.

The above two paragraphs are related: for a type of module to be used for a novel function (and then to continue to evolve from there), and yet still perform its current one, either an extra module must have been created or there must have been one ‘spare’. Either way, a duplication system is required. This could be either by gene duplication or as part of a developmental process - a recipe. Gene duplication can be rejected as a sole source of neural module duplication, because our genes do not have the capacity to store all specific connections without a modular coding [Boers and Kuiper 1992]. Therefore, we come to the conclusion that for the effective evolution of neural structures, a developmental process is required.

7 An overview of modular developmental strategies for ANNs

There are currently three main approaches to the modular development of ANNs: cellular encoding, cellular automata and Lindenmayer systems.
The first two are examined and argued against in the first two sections of this chapter. The third section introduces Lindenmayer systems and argues for them in this field; the next chapter gives a fuller introduction to them. The points made in chapter 6 should be kept in mind whilst reading this chapter.

7.1 Frédéric Gruau’s Cellular Encoding

Frédéric Gruau [1996] uses genetic programming (GP - see section 2.3) to evolve his ‘cellular programming language’ code to develop modular artificial neural networks. The programs used are trees of graph-rewrite rules whose main points are cell division and iteration. Like many practitioners of GP, he considers (the author has heard him say) that evolutionary algorithms can be used only as a tool in the design process.

The crucial thing required for this project that is missing from Gruau’s approach is exactly what is missing from GP. Modularity can only come from either gene duplication (see the objections in chapter 6) or iteration. But iteration is not a powerful enough modular developmental backbone. Consider, for example, the cerebral cortex’s macro-modules of hundreds of minicolumns, mentioned in chapter 6. These are complicated structures that cannot be generated with a ‘repeat one hundred times: minicolumn’ style rule. There is variation between macro-modules.

So in GP, we are reduced to gene duplication for all but simple iterative structures. What is required is a rule of the sort ‘follow (rules X)’, where (rules X) is a marker for (pointer to) rules encoded elsewhere on the genotype. But this would be difficult to incorporate into GP. A more sensible route would be to use a system which was designed to handle such rules. The systems in sections 7.2 and 7.3 are both such systems.

7.2 Cellular Automata

The use of conventional cellular automata (CA) for the construction of artificial neural networks can be found in many books on CA. However, the results are always poor networks.
The interest there seems to be more in the area of neuron growth than in the development of the network. Whilst in principle I cannot see any objection to evolving CAs for ANN development, the amount of work involved was clearly too great for this project, and the approach appeared to offer no advantage over the system chosen: Lindenmayer systems (next section), which are related to cellular automata.

CAM-Brain [de Garis 1993] is a project to ‘implement a cellular automata based artificial brain with a billion neurons by 2001, which grows/evolves at (nano-)electronic speeds’ (from the abstract). Whilst the resulting technology may be of use in this field, there are two main problems with the CAM-Brain project as it stands, in relation to this project. The first is that not enough consideration has been given to what will be required to evolve intelligent behaviour: a single brain without any suitable way of interacting with other intelligences at (nano-)electronic speeds will be of little use for the evolution of intelligence. This could be overcome by using hundreds or thousands of these devices (possibly inside one machine). The second, more serious problem with the current approach is that it involves feeding ‘signal cells’ along ‘trails’ (contained by ‘sheath cells’). The signal cells would be determined by the genotype.
This amounts to a program and, without serious thought about how to incorporate a ‘follow (rules X)’ style rule, receives the same criticism and rejection (for work such as this project) as Gruau’s cellular encoding.

7.3 Lindenmayer Systems

Lindenmayer systems (L-systems), which are introduced fully in the next chapter, can be seen as a relation of cellular automata, possibly a parent (although not historically). They need not use the restriction that a cell always stays in the same position (and so keeps the same neighbouring positions), and they are commonly not restricted to a regular grid arrangement.

For this project, L-systems offer all the advantages of CAs, and more, and none of the disadvantages. Also, work on the evolution of L-systems as modular developmental strategies for ANNs has already been undertaken with some success (see the next chapter). I felt that they were the clear choice.

8 Lindenmayer Systems

As discussed in chapter 6, gene theory has shown us that in nature genes are used as a recipe for each cell to follow. The development of each cell is determined by the relevant genes, which are determined by the cell’s immediate environment. All cells use the same set of rules, the genes. This principle is related to that of fractals [Mandelbrot 1982].

Lindenmayer systems [Lindenmayer 1968] (L-systems) were developed to model the biological growth of plants. They are a class of fractals which apply rules (‘production rules’) in parallel to the cells of their subject. Thus a specified ‘axiom’ subject (typically one or two cells) develops by repeated re-application of these rules. The use of context-sensitive production rules allows the rule used for a step in a cell’s development to be determined by the cell’s immediate environment.
A context-sensitive production rule commonly takes the following form:

    L < P > R → S

    where L = left-context       P = predecessor (old cell)
          R = right-context      S = successor (replacement cell or cells)

Figure 3: A context-sensitive L-system production rule template

The most specific production rule that matches a cell's situation is applied. L-systems with more than two (left and right) contexts could exist, but I have not as yet seen any, probably because no related application requires more than two contexts.

8.1 Kitano's Graph-Generating Grammar for evolving ANN connectivity matrices

Kitano [1990] used a very simple form of L-system to evolve 2^k x 2^k connectivity matrices. The rules had no contexts and took the form: P → (AB)(CD). The rules were encoded in blocks of five characters (eg. PABCD) on the genotype. The number of rules on the genotype was variable. Development started with the axiom character matrix 'S', which was programmed to always be the first character on the genotype. After each developmental step, the matrix would have doubled in both width and height. The final rules used would have lower-case characters in the successor. These would be from {a,b,c,...,p}, which were constants representing the 16 possible 4-bit binary numbers to use in the connectivity matrix.

Kitano demonstrated better results than direct encoding when evolving simple ANNs (such as XOR and simple encoders) using training by backpropagation. He also showed that the number of rules could be small.

One criticism of Kitano's work must be that the resulting network architectures are just fully connected clumps of unconnected nodes; backpropagation is still doing most of the work.

8.2 Boers' and Kuiper's L-systems for evolving ANNs

Boers and Kuiper [1992] used L-systems to evolve modular feedforward network architectures that were evaluated after training by backpropagation. If two modules were set as connected, then they had links from every 'output node' of the first module to every 'input node' of the second.
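Kitano's matrix-doubling development, described in section 8.1 above, can be sketched roughly as follows. The rule set here is invented for illustration and is not Kitano's actual genotype; each context-free rule replaces one character with a 2x2 block, so the matrix doubles in width and height on every step.

```python
# Toy rule set in Kitano's P -> (AB)(CD) form; upper-case characters are
# non-terminals, lower-case characters are the terminal constants that
# stand for 4-bit rows of the connectivity matrix. (Illustrative only.)
rules = {
    "S": [["A", "B"], ["C", "D"]],
    "A": [["a", "p"], ["p", "a"]],
    "B": [["p", "p"], ["p", "p"]],
    "C": [["p", "p"], ["p", "p"]],
    "D": [["a", "p"], ["p", "a"]],
}

def develop(matrix, steps):
    """Apply the context-free rules in parallel to every cell."""
    for _ in range(steps):
        size = len(matrix)
        new = [[None] * (2 * size) for _ in range(2 * size)]
        for i in range(size):
            for j in range(size):
                block = rules[matrix[i][j]]   # 2x2 replacement block
                for di in range(2):
                    for dj in range(2):
                        new[2 * i + di][2 * j + dj] = block[di][dj]
        matrix = new
    return matrix

# 'S' axiom, two developmental steps: a 4x4 character matrix.
# (In this toy rule set the terminal constants have no rules of their
# own, so development stops here.)
m = develop([["S"]], 2)
```

Two steps take the one-character axiom to a 4x4 matrix of terminal constants, which would then be expanded into the 4-bit rows of the connectivity matrix.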
A module's 'input nodes' are those that do not receive any input link from a node within the module. Similarly, a module's 'output nodes' are those that do not send any output link to a node within the module.

A fixed-length alphabet was used for the rules. Letters A to H represented the possible node characters, numbers 0 to 5 the possible 'skip' sizes, and the characters '[' and ']' grouped nodes and modules into larger modules. A skip number N after a module indicates that output links are to be made to the module N modules further down the current network string. The restricted alphabet restricted the possible network architectures but still produced some good results.

The left and right contexts are lists of nodes that the predecessor should be connected to. The predecessor could contain skips and modules and had to contain at least one letter. The successor could also contain skips and modules but could be empty.

Figure 4: An example L-system development: from [Boers et al. 1993]

The genetic encoding (or rather decoding) involved starting from each of the first five bits of the genotype and reading the string six bits at a time. A translation table (based loosely on the genetic code of RNA) was used to translate each six bits into a character from the rule-set's alphabet and '*'. Each minimal substring containing five *s encodes a production rule (*L*P*R*S*); invalid production rules are thrown out. Then the decoding is repeated, using the genotype in reverse.

The evolution of production rules used a conventional GA, with fixed-length genomes initially randomised. The axiom 'A' was always used. A limit of 6 rewrite passes over the network string was used (and required by most genomes).
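The six-bit decoding just described might be sketched as follows. The translation table and genome here are tiny toy inventions (Boers and Kuiper used a full table over all 64 six-bit words), and a real decoder would also filter out invalid rules and repeat the whole process from each of the first five bits and over the reversed genotype.

```python
import re

# Toy translation table: each six-bit word maps to a rule-alphabet
# character or '*'. Invented for illustration only.
TABLE = {"000000": "*", "000001": "A", "000010": "B",
         "000011": "[", "000100": "]", "000101": "2"}

def decode(genome):
    """Translate the genome six bits at a time, then extract every
    minimal substring containing five *s as one *L*P*R*S* rule."""
    chars = "".join(TABLE.get(genome[i:i + 6], "")   # unknown words skipped
                    for i in range(0, len(genome) - 5, 6))
    # The lookahead makes matches overlap, so every star can start a rule,
    # as 'each minimal substring containing five *s' suggests.
    return re.findall(r"(?=\*([^*]*)\*([^*]*)\*([^*]*)\*([^*]*)\*)", chars)
```

Each returned tuple is one candidate (L, P, R, S) production rule.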
One of the main fitness functions used was testing on a simple character recognition problem: distinguishing between 3x3-pixel 'T's and 'C's, rotated and embedded within a 4x4-pixel array.

One criticism of Boers' and Kuiper's work must be that the resulting network architectures are just fully connected clumps of unconnected nodes; backpropagation is still doing most of the work. It is possible, from their results, that the genetic algorithm is doing little more than constructing layers of nodes with the provision for links that skip intermediate layers.

Another criticism is that when the resulting rules are run, it becomes clear that they generate many more nodes than are used. These extra nodes serve a function as modules to be skipped, but it would surely be better if variations in skip size were set in the skip sizes than by inserting redundant nodes that make up most of the 'network'; that is, the (skip size) representation that was intended is not the one being used, and so should either be removed or changed.

The production rules and rewriting process shown in Figure 4 are:

    Production rules:
        A → B0B0B
        B > B → [CD]
        B → C
        C < D → C
        D > D → C1

    The rewriting process and resulting network strings:
        A ⇒ B0B0B ⇒ [CD]0[CD]0C ⇒ [CC1]0[CC]0C

IV CENTRAL IDEAS OF THE PROJECT

This part (part IV) of the project sets out the key conclusions reached during study of the area, as set out in parts I to III, before the design of the main experiment was started.

9 Artificial Intelligence calls for Evolutionary Emergence

There are two ways in which we can proceed towards an artificial intelligence of the human/ant (as opposed to computer chess) variety:

1) try to understand enough about intelligence to be able to program it in;
2) try to build a device which outperforms the specifications we give it, that is, intelligent.

The first is the approach of traditional artificial intelligence. I believe that much progress, in terms of useful software with intelligent aspects, will be made in this way.
But I agree with many others in the AI community that we cannot conceivably go all the way with this approach. Therefore we are left with option 2. Evolutionary emergence, from genetic algorithms, seems to be the most promising route to building a device which outperforms the specifications that we give it.

10 Evolutionary Emergence calls for the dismissal of Explicit Fitness functions

Natural evolution has no (explicit) evaluation functions. Individuals simply live until they die, reproducing, or not, during this time. As a result of organism-environment interactions, including interactions between similarly-capable organisms, certain behaviours fare better than others. This is how individuals are 'evaluated' and how the non-random cumulative selection works without any long-term goal. It is also why new abilities can emerge.

If we want to achieve this level of emergence in our artificial evolution, then we must stop treating evolution as a search for the optimum organism, with evaluation according to some guess at what that involves.

11 Evolutionary Emergence of AI calls for Developmental Modularity

The evolutionary emergence of (better than very low-level) intelligence requires new neural structures. We can expect all (or at least most) neural structures to be descended from neural structures which once had a different (non-trivial) use.

For a neural structure to be used in a novel way (possibly after further evolution) and yet still perform its current role(s), a duplication system is required.
This could be either gene duplication or part of a modular developmental process. Gene duplication can be rejected as a sole source of neural structure duplication, because the capacity required to store all specific connections in a large network without a modular coding is infeasible. Therefore, we come to the conclusion that for the effective evolutionary emergence of intelligence, a modular developmental process is called for.

12 Developmental Modularity calls for L-systems

A modular development strategy should be able to use a module in a large number of specific situations (rather than just in an iterative fashion). It makes intrinsic sense that the development of part of an organism should depend partly on its local environment, including itself, and not on remote parts of the environment. Lindenmayer systems can have both of these fundamental properties. Indeed, using the most general notion of an L-system, it is possible that all systems with these properties are examples of L-systems.

Whenever a new node is created, it is set as the next network input node if it has no inputs, the next network output node if it has no outputs, and neither if it has either. Once a network input or output has been allocated, it is fixed. If that node is later deleted, then a new node cannot take its place; the network input or output would simply always be 0 on that input/output. This method of allocation of network inputs and outputs ensures that the addition or removal of a network input/output node at a later stage of development will not wreck the numbering of previously set network inputs and outputs.

14 Genetic Encoding of the Production Rules

The genetic encoding (or rather decoding) is loosely similar to that used by Boers and Kuiper. For every bit of the genotype, an attempt is made to read a rule that starts on that bit.
A valid rule is one that starts with '11' and has enough bits after it to complete a rule; the number of bits varies because the node-characters can be of any length.

To read a rule, the system uses the idea of 'segments'. A segment is a bit string with its odd-numbered bits (1st, 3rd, 5th, ...) all 0. Thus the reading of a segment is as follows: read the current bit; if it is a 1 then stop; else read the next bit - this is the next information bit of the segment; now start over, keeping track of the information bits of the segment. Note that a segment can be empty (have 0 information bits).

The full procedure to (try to) read a rule is to read a segment for each of the left context, predecessor, right context, successor 1 (replacement node) and successor 2 (new node). Then, if possible, the six 'links details' bits are read. Only if all this is achieved before the end of the genotype is a rule created. For a context, the first information bit of a segment is made the link type bit and any remaining bits are made the node character specification.

Figure 6: Example of the genome decoding method used in this project

    GENOTYPE: 1 0 1 1 1 1 0 1 1 0 1 1 0 0 1 0 1 1 1 0 0 0 0 0

    Decoding: +++ < > _1_ → _1_ * _0_ *  0 1 1 1 0 0
              +++ < _1_ > _1_ → _0_ * _1_ *  1 0 0 0 0 0

    Rules:  <  > 1, → 1,0   0,1,1,1,0,0
            < 1 > 1, → 0,1   1,0,0,0,0,0

    (Template: Lt,Lc < P > Rt,Rc → Sr,Sn, b1,b2,b3,b4,b5,b6)

In fact, this genotype encodes a valid XOR network - see appendix 3.

15 Initial Experiment to verify the system so far

A fairly conventional distributed genetic algorithm was used to evolve these L-systems (producing ANNs as above). Fitness was calculated as -1 * (sum of errors upon testing on XOR with inputs 00, 01, 10, 11). Thus the worst possible fitness (using the ANNs of this project) was -4, the best 0, and a network that produces no output has fitness -2.

The GA used a population of size just 64, distributed over a 3-dimensional torus of side length 4, each position on the torus always being occupied.
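The segment-reading procedure of chapter 14 can be sketched as follows; the helper below is my own hypothetical rendering, taking the genotype as a list of 0/1 integers.

```python
# Odd-numbered bits act as continue(0)/stop(1) flags, so a segment
# '0a 0b 0c ... 1' yields the information bits a, b, c, ...
def read_segment(bits, pos):
    """Return (information_bits, position after the segment), or None
    if the genotype ends before the segment is terminated."""
    info = []
    while True:
        if pos >= len(bits):
            return None                # ran off the end of the genotype
        if bits[pos] == 1:
            return info, pos + 1       # flag bit 1 terminates the segment
        pos += 1
        if pos >= len(bits):
            return None
        info.append(bits[pos])         # flag bit 0: next bit is information
        pos += 1
```

For example, read_segment([0, 1, 0, 0, 1], 0) stops at the final flag bit 1 and returns the information bits [1, 0]; a full rule reader would call this five times (for L, P, R, S1 and S2) and then take the six 'links details' bits.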
An individual is selected at random, then two of its neighbours. The fitter neighbour is selected as the other parent, and the weaker neighbour is replaced by the child (produced using variable-length crossover and mutation, described in chapter 17). Further details are not important, as any conventional GA would have produced similar results.

After the genotype lengths have increased enough to form random rules (see chapter 17), an optimum network (that is, rules encoding an XOR network) is soon found. See appendix 3 for the output of a sample run of this experiment. The code for the experiment is in appendix 4.

It is worth noticing that after a solution has been found, the GA goes on to find solutions with more rules, which do not decrease the fitness of the individual.

VI THE MAIN EXPERIMENT

16 Details of the Individuals being evolved

The individuals being evolved are recurrent ANNs of the type described in section 3.3. Their genotypes are decoded into L-system production rules, as described in part V. Individuals also have associated with them their position in the (artificial) world and the direction they are facing.

The first five network outputs have pre-programmed functions associated with them:

    network output 1: try to reproduce with an organism directly in front;
    network output 2: try to kill an organism directly in front;
    network output 3: turn anti-clockwise by ninety degrees;
    network output 4: turn clockwise by ninety degrees;
    network output 5: try to move one space forward.

See the next chapter for details of the functions and network input and output.

17 Outline of the Genetic Algorithm used

The main point to notice about the following description of the GA is that no (explicit) fitness function is used.
The hope was to see emergent intelligent (and other) behaviours, such as individuals only trying to mate with others of the same species, only killing those of other species, and acting in groups where beneficial.

The genetic algorithm (GA) operates on a population distributed over a two-dimensional torus. Population sizes used in tests were normally in the hundreds, sometimes thousands.

Initialisation

Every space in the world has an individual with genotype '0' put in it.

The Main Loop

The population is passed over in cycles. In each cycle, every individual alive at the start of the cycle will be processed once. The order in which the individuals are processed is randomised. This random cycle method prevents some of the emergent properties that were found to dominate the evolution (and so prevent more interesting emergent properties) in runs using a simpler method.

To process an individual, first its network inputs are set equal to the network outputs of the individual in front of it. This will depend on the direction the individual is facing. To be more specific: for each network input node, if the individual in front has a corresponding network output node (see morphogenesis details) then the network input value is copied from there; otherwise the network input value is set to zero.

The activities of the individual's nodes are now updated, with just one synchronous step. Thus the fastest possible response to an input would involve a link directly from the input node to an output node.
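The randomised main loop described above might be sketched as follows; the toy population structure and the 'process' callback are my own, not the project's actual data structures.

```python
import random

def run_cycle(population, process):
    """One pass over the world: every individual alive at the start of
    the cycle is processed exactly once, in random order."""
    order = list(population)       # snapshot: newborns wait a cycle
    random.shuffle(order)
    for key in order:
        if key in population:      # it may have been killed this cycle
            process(population, key)
```

The snapshot-and-shuffle is the point: without it, a fixed processing order gives some world positions a systematic advantage, which is one of the dominating emergent properties the random cycle method was introduced to prevent.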
The more links involved (including any recurrent loops), the more time cycles are required.

Depending on the (excitatory) outputs of the ANN, the individual now carries out the appropriate pre-programmed actions:

    If there is a number 1 network output node producing sufficient excitatory output, the individual tries to reproduce with an organism directly in front of it;
    If there is a number 2 network output node producing sufficient excitatory output, the individual tries to kill an organism directly in front of it;
    If there is a number 3 network output node producing sufficient excitatory output, the individual turns anti-clockwise by ninety degrees;
    If there is a number 4 network output node producing sufficient excitatory output, the individual turns clockwise by ninety degrees;
    If there is a number 5 network output node producing sufficient excitatory output, the individual tries to move one space forward.

Note that, for example, there need not be a number 1 network output node for there to be a number 2 - see the morphogenesis section. Any further network outputs do not produce any pre-programmed action, although they are copied to other individuals' network inputs at the same time as the first five network outputs.

An individual can only reproduce with or kill an organism directly in front of it if there is one there. It can only move one space forward if there is nothing there. If reproduction occurs, then the child is born in the space beyond the individual being mated with if it is empty; otherwise the child replaces the individual being mated with.

Reproduction involves crossover and mutation followed by morphogenesis:

Crossover

The parent chromosomes (genotypes) are randomly ordered, so that either might form the beginning of the child's chromosome. A cut point is chosen somewhere along the first parent chromosome and the corresponding cut point is set on the second.
There is a programmed-in twenty-five percent probability that the second chromosome's cut point will be offset by one gene, either upstream or downstream, from the first chromosome's cut point. Strict crossover is enforced: the first cut point must be after the first gene, and the cut point for the second parent chromosome is set to just before its last gene if it was after it. This is what provides the main force for the initial increase in genotype length (before any L-system rules are formed) but becomes negligible once long genotype lengths have been achieved.

The child chromosome is copied from these two sections of the parents' chromosomes.

Mutation

One gene (one bit) is picked at random on the genotype, and flipped.

Morphogenesis

When (before) an organism is born, it undergoes morphogenesis. This is currently set as four morphogenesis steps. Each morphogenesis step is a synchronous pass over the ANN's nodes, using the most specific rule from its L-systems for each node. The initial direction the organism is facing is set at random (from north, east, south, west).

Whilst I would have liked to use more morphogenesis steps, and indeed this does not seem to slow the program down much, I found it very difficult to understand what was happening in a recurrent ANN of more than 32 nodes as it was (as often occurs with 4 steps: 2 nodes from the axiom network repeatedly dividing 4 times results in 32 nodes) - see appendix 1. Similarly, the only problem with using development during an individual's life (that is, not all at birth) was that this also made it harder to understand what was going on.

18 Implementation

The world, a two-dimensional torus, is shown in the main window as a square, along with the number of passes over the population that have been performed (the 'time'). An individual is shown as an isosceles triangle in the world, the direction it is facing being indicated by the orientation of the triangle. See appendix 1.
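For concreteness, the variable-length crossover of the previous chapter might be sketched as follows; genotypes are taken as bit strings, and the function name and exact clipping behaviour are my reading of the text rather than the project's code.

```python
import random

def crossover(p1, p2):
    """Variable-length one-point crossover: random parent ordering, a
    cut after the first gene of the first parent, a corresponding cut
    on the second parent offset by one gene 25% of the time, clipped
    to fall before the second parent's last gene."""
    a, b = random.sample([p1, p2], 2)      # random parent ordering
    cut = random.randint(1, len(a))        # must fall after the first gene
    cut2 = cut
    if random.random() < 0.25:             # 25%: offset by one gene
        cut2 += random.choice((-1, 1))
    cut2 = min(max(cut2, 1), len(b) - 1)   # clip to just before last gene
    return a[:cut] + b[cut2:]
```

Because the second cut is clipped to before the last gene, a cut at the very end of the first parent yields a child one gene longer, which is the gentle pressure towards longer genotypes described above.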
Most runs used a world of 'side-length' 20, which thus contains up to 400 individuals. Worlds with thousands of individuals were sometimes used, but these incurred correspondingly slow run times for a time-cycle, which is clearly of the order of side-length squared.

In this way it is possible to view the externals of what is going on in the world. To view the internals of an individual, the user simply clicks on the individual to display it in a separate window. This has been set up to temporarily halt the main program, allowing the user to click on and view more individuals, in more windows, before the world changes. The main program continues once all the other windows have been closed.

A view of an individual includes its genotype, L-system rules and ANN, post-morphogenesis. Initially the ANN is displayed with input nodes near the bottom of the window, output nodes near the top, and all others in the middle. To make the network clearer, the user can move the nodes around the window by dragging and dropping them. Picking up nodes is made easier by the use of a hidden grid, which all nodes and mouse-click positions are snapped to. Links and labels are automatically updated as the nodes are dragged around the window.

Despite this, networks are still very difficult to understand - see appendix 1.
The number of links in most interesting individuals made it hard enough to examine them on a good screen, where nodes can be repeatedly moved around, and impossible on a printout; hence only one is included in this dissertation.

Note: for documentation reasons, a right-click within either the world window or an individual-view window has been programmed to write a picture metafile, containing the instructions to draw the window's contents, to the operating system's clipboard.

19 Results

Careful and repeated examination of the world and the networks within it has revealed the following as a 'typical' run of the evolution:

[I cannot be sure that all of these happen every time, but most are easy to spot and do seem to appear in at least most runs.]

1) Excitatory activity builds up in the axiom networks until it passes the decision threshold in some individuals, at which point (random) reproduction begins. This happens too fast to see at execution speed; a trace must be used to observe it.

2) Breeding continues randomly until genotype lengths have increased to a length capable of containing a valid production rule.

3) Breeding is no longer totally random. Initial successful individuals include those that spin around, trying to reproduce all the time (spinning brings them into contact with more individuals), and those that spin around, killing all the time.

4) Networks that have links to the 'move forward' node appear in the population. Individuals that 'kill and move forward' all the time do well, killing off some of the spinner-killers of 3.

5) Combinations of the basic individuals appear.
For example, individuals that always try to move forward, turn some of the time, try to kill some of the time, and try to reproduce when they are not killing, do well.

7) Up till now, much of the activation in an individual's network has come in via its first few nodes, something common to most of the population from early on. Such individuals rely on noise, when not near others, to keep their activations up in order to act. Now, however, killers are making the population fewer in number and there are many blank areas in the world. Around now, individuals with recurrent excitatory-link-loops within them appear. These loops in effect supply an activation store, allowing the individuals to be less random in their behaviour when in empty areas. I have observed this a few times.

8) Towards the end, I sometimes think (but am not sure) that some of the networks are using outputs and inputs above number 5 to recognise each other and determine their actions. However, such networks have so many links that I cannot be sure that this is really true. It certainly looks as though they can repeatedly bump into similar individuals without killing them, and yet run around killing most of the other individuals in the world. Curiously, they do not seem to reproduce as much as they kill.

9) The end usually comes very rapidly, when very good killers annihilate most of the world and then themselves get killed by (I presume) a stationary rotator-killer. A few such stationary individuals remain but have no possibility of moving (because they do not have the relevant output node) and so no possibility of evolving.

VII CONCLUSIONS AND SUGGESTIONS FOR FURTHER WORK

20 Conclusions

This project has been a 'probable success'. That is, it appears that the arguments used have been proved correct by the main experiment; some of the observed behaviours could indeed be considered intelligent, if only at a very low level.
However, there is too much uncertainty about the more impressive results, such as individuals using spare outputs and inputs to recognise each other. More time is required to verify the greater claims.

If nothing else, I believe that I have created a good base from which to continue, which was my personal aim on starting the project - I plan to continue with this and related work. Looking back, it is possible that I have attempted too much and so not dealt with some of the theoretical issues, such as the genetic encoding of production rules, in the depth they deserve.

21 Suggestions for Further Work

The first problem that I think should be tackled is that of how to examine the system in such a way as to be confident of what is going on within the ANNs.

Then I think that some of the theory should be 'firmed up'. A fair amount of this work has been based on experimental results, both my own and other people's. I believe that some of it, especially the genetic encoding of production rules, is open to a theoretical approach.

Further issues include making development take place over an individual's lifetime and the incorporation of lifetime learning. I would also like to put the few remaining hard-specified parameters under evolutionary control. However, I see these issues as secondary.

BIBLIOGRAPHY

[Boers and Kuiper 1992] E.J.W. Boers and H. Kuiper. Biological metaphors and the design of modular artificial neural networks. Unpublished joint Master's thesis, departments of Computer Science and Experimental Psychology, Leiden University, The Netherlands.

[Boers et al. 1993] E.J.W. Boers, H. Kuiper, B.L.M. Happel and I.G. Sprinkhuizen-Kuyper. Designing modular artificial neural networks. Technical report, Leiden University.

[Boers et al. 1995] E.J.W. Boers, M.V. Borst and I.G. Sprinkhuizen-Kuyper. Evolving artificial neural networks using the "Baldwin Effect". In D.W. Pearson, N.C. Steele and R.F.
Albrecht, editors, Artificial Neural Nets and Genetic Algorithms: Proceedings of the International Conference in Alès, France, 333-336. Springer-Verlag, Wien, New York.

[Brooks 1991a] R.A. Brooks. Intelligence without representation. Artificial Intelligence, 47: 139-159.

[Brooks 1991b] R.A. Brooks. Intelligence without reason. In Proceedings of the Twelfth International Joint Conference on Artificial Intelligence (IJCAI-91), 139-159.

[Brooks 1992] R.A. Brooks. Artificial life and real robots. In Proceedings of the First European Conference on Artificial Life. MIT Press/Bradford Books, Cambridge, MA.

[Cariani 1991] P. Cariani. Emergence and Artificial Life. In C.G. Langton, J.D. Farmer, S. Rasmussen and C. Taylor, editors, Artificial Life II, Santa Fe Institute Studies in the Sciences of Complexity, Vol. X, 775-797. Addison-Wesley.

[Clark 1994] A. Clark. Happy couplings: emergence and explanatory interlock. Department of Philosophy, Washington University in St. Louis.

[Cliff et al. 1992] D. Cliff, I. Harvey and P. Husbands. Incremental evolution of neural network architectures for adaptive behaviour. Technical report CSRP256, University of Sussex School of Cognitive and Computing Sciences.

[Cliff et al. 1993] D. Cliff, I. Harvey and P. Husbands. Explorations in evolutionary robotics. Adaptive Behaviour, 2(1): 73-110.

[Cliff et al. 1996] D. Cliff, I. Harvey and P. Husbands. Artificial evolution of visual control systems for robots. To appear in M. Srinivisan and S. Venkatesh, editors, From Living Eyes to Seeing Machines. Oxford University Press, in press 1996.

[Crutchfield 1994] J.P. Crutchfield. Is Anything Ever New? - Considering Emergence. In G. Cowan, D. Pines and D. Melzner, editors, Santa Fe Institute Studies in the Sciences of Complexity XIX. Addison-Wesley.

[Dawkins 1986] R. Dawkins. The Blind Watchmaker. Longman, Essex.

[de Garis 1993] H. de Garis. CAM-BRAIN.
Report, Brain Builder Group, Evolutionary Systems Department, ATR Human Information Processing Research Laboratories, Kansai Science City, Kyoto, Japan.

[Elias 1992] J.G. Elias. Genetic Generation of Connection Patterns for a Dynamic Artificial Neural Network. In L.D. Whitley and J.D. Schaffer, editors, Proceedings of COGANN-92: International Workshop on Combinations of Genetic Algorithms and Neural Networks, 38-54. IEEE Computer Society Press, Los Alamitos, CA.

[Goldberg 1989] D.E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, Reading, Massachusetts, USA.

[Goldberg et al. 1990] D.E. Goldberg, K. Deb and B. Korb. An investigation of messy genetic algorithms. Technical report TCGA-90005, TCGA, The University of Alabama.

[Gruau 1992] F. Gruau. Genetic synthesis of boolean neural networks with a cell rewriting developmental process. In L.D. Whitley and J.D. Schaffer, editors, Proceedings of COGANN-92: International Workshop on Combinations of Genetic Algorithms and Neural Networks, 55-74. IEEE Computer Society Press, Los Alamitos, CA.

[Gruau 1996] F. Gruau. Artificial cellular development in optimization and compilation. Technical report, Psychology Department, Stanford University, Palo Alto, CA.

[Harvey et al. 1992] I. Harvey, P. Husbands and D. Cliff. Issues in Evolutionary Robotics. Technical report CSRP219, School of Cognitive and Computing Sciences, University of Sussex. Also in J.A. Meyer, H. Roitblat and S. Wilson, editors, Proceedings of SAB92, the Second International Conference on Simulation of Adaptive Behaviour. MIT Press/Bradford Books, Cambridge, MA, 1993.

[Harvey 1992a] I. Harvey. Species Adaptation Genetic Algorithms: A basis for a continuing SAGA. Technical report CSRP221, School of Cognitive and Computing Sciences, University of Sussex.

[Harvey 1992b] I. Harvey. Evolutionary Robotics and SAGA: The case for hill crawling and tournament selection.
Technical report CSRP222, School of Cognitive and Computing Sciences, University of Sussex.

[Harvey 1992c] I. Harvey. The SAGA Cross: The mechanics of recombination for species with variable-length genotypes. Technical report CSRP223, School of Cognitive and Computing Sciences, University of Sussex.

[Hinton and Nowlan 1987] G.E. Hinton and S. Nowlan. How learning can guide evolution. Complex Systems, 1: 495-502.

[Holland 1992] J.H. Holland. Genetic Algorithms. Scientific American, 267(1): 44-50.

[Holland 1993] J.H. Holland. Adaptation in Natural and Artificial Systems. MIT Press.

[Kitano 1990] H. Kitano. Designing neural networks using genetic algorithms with graph generation system. Complex Systems, 4: 461-476. Champaign, IL.

[Koza 1990] J.R. Koza. Genetic programming: A paradigm for genetically breeding populations of computer programs to solve problems. Technical report STAN-CS-90-1314, Department of Computer Science, Stanford University.

[Koza 1992] J.R. Koza. Genetic Programming. MIT Press/Bradford Books, Cambridge, MA.

[Lindenmayer 1968] A. Lindenmayer. Mathematical models for cellular interaction in development, parts I and II. Journal of Theoretical Biology, 18: 280-315.

[Mandelbrot 1982] B.B. Mandelbrot. The Fractal Geometry of Nature. Freeman, San Francisco.

[Maynard Smith 1987] J. Maynard Smith. When Learning Guides Evolution. Nature, 329: 761-762.

[Miller et al. 1989] G. Miller, P.M. Todd and S.U. Hegde. Designing Neural Networks using Genetic Algorithms. In J.D. Schaffer, editor, Proceedings of the Third International Conference on Genetic Algorithms, 379-384. Kaufmann, San Mateo, CA, 1989.

[Mitchell 1993] M. Mitchell and S. Forrest. Genetic algorithms and artificial life. Working Paper 93-11-072, Santa Fe Institute.

[Murre 1992] J.M.J. Murre. Learning and Categorization in Modular Neural Networks. Harvester Wheatsheaf, 1992.

[Nolfi and Parisi 1995] S. Nolfi and D. Parisi. Learning to adapt to changing environments in evolving neural networks.
Technical report 95-15, Department of Neural Systems and Artificial Life, Institute of Psychology, National Research Council, Rome, Italy.

[Peretto 1992] P. Peretto. An Introduction to the Modeling of Neural Networks. Cambridge University Press.

[Purves et al. 1992] D. Purves, D.R. Riddle and A-S. LaMantia. Iterated patterns of brain circuitry (or how the cortex gets its spots). Trends in Neurosciences, 15(10): 362-367, 1992.

[Rumelhart et al. 1992] D.E. Rumelhart, J.L. McClelland and the PDP Research Group, editors. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1 (tenth printing).

[Sannier and Goodman 1987] A.V. Sannier II and E.D. Goodman. Genetic learning procedures in distributed environments. Technical report, A.H. Case Center for Computer-Aided Engineering and Manufacturing, Michigan State University.

[Smolensky 1992] P. Smolensky. Information processing in dynamical systems: Foundations of harmony theory. In D.E. Rumelhart, J.L. McClelland and the PDP Research Group, editors, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1 (tenth printing), 194-281.

[Steels 1994] L. Steels. The Artificial Life roots of Artificial Intelligence. Cited in [Clark 1994]; not read by the author of this project.

[Stork et al. 1991] D.G. Stork, B. Jackson and S. Walker. 'Non-optimality' via pre-adaptation in simple neural systems. In C.G. Langton, J.D. Farmer, S. Rasmussen and C. Taylor, editors, Artificial Life II, Santa Fe Institute Studies in the Sciences of Complexity, Vol. X, 409-429. Addison-Wesley.