Evolutionary robotics is a field of research that employs evolutionary computation to generate robots that adapt to their environment through a process analogous to natural evolution. The generation and optimisation of robots are based on evolutionary principles of blind variations and survival of the fittest, as embodied in the neo-Darwinian synthesis (Gould, 2002).

Evolutionary robotics is typically applied to create control systems for robots. Although less frequent, evolutionary robotics can also be applied to generate robot body plans, and to coevolve control systems and body plans simultaneously (Lipson and Pollack, 2000). In this respect, evolutionary robotics differs from the Artificial Life domain in its use of physical robots. In particular, evolutionary robotics puts a strong emphasis on embodiment and situatedness, and on the close interaction of brain, body, and environment, which is crucial for the emergence of intelligent, adaptive behaviour and cognitive processes (e.g. Clark, 1997; Chiel & Beer, 1997; Nolfi & Floreano, 2002).

Evolutionary robotics is organised along two axes of research: one concerned with cognitive science (Harvey et al., 2005) and biology (Floreano and Keller, 2010); the other focused on using evolutionary robotics techniques for engineering purposes (Silva et al., 2016), with the long-term goal of obtaining a process capable of automatically designing and maintaining an efficient robotic system. Evolutionary robotics is a highly general approach, as it enables the synthesis of control or body plans given only a specification of the task, and is not tied to specific evolutionary algorithms, control systems, or types of robots (Bongard et al., 2006; Cully et al., 2015).

Basic framework

Figure 1: Schematic illustration of combined body plan and control system specification, as described in Lipson and Pollack, 2000. Evolution can connect parts to each other to form arbitrary trusses (body plan), and can connect neurons to each other via synapses to form arbitrary neural networks (control systems). Neurons also connect to bars: in the same way that a real neuron governs the contraction of muscle tissue, the artificial neuron controls the length of the bar via a linear actuator. No sensors are used. As a result, these robots can generate patterns of actions, but cannot directly sense their environment.

Similarly to more traditional evolutionary computation approaches, evolutionary robotics techniques operate with a population of candidate solutions or genomes. Each genome in the population encodes a number of parameters of one or more robots’ body plan or control system, the phenotype (see Fig. 1). If the genome describes a body plan, genome parameters can determine whether body parts should be added to the body plan, or configure parts of the body plan, such as the angle and range for specific joints (Bongard, 2011). If the genome encodes an artificial neural network-based controller, the synaptic weights can be represented as a real-valued vector at the genome level.
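As an illustration of the last point, the sketch below decodes a flat real-valued genome into the weights and biases of a single-layer network that maps sensor readings to motor commands. The function names, the network layout, and the tanh activation are assumptions made for this example; they are not taken from any of the cited studies.

```python
import math

def decode_genome(genome, n_inputs, n_outputs):
    """Split a flat real-valued genome into a weight matrix and a bias vector."""
    expected = n_outputs * (n_inputs + 1)
    if len(genome) != expected:
        raise ValueError(f"expected {expected} genes, got {len(genome)}")
    # The first n_outputs * n_inputs genes are synaptic weights, one row per output.
    weights = [genome[o * n_inputs:(o + 1) * n_inputs] for o in range(n_outputs)]
    # The remaining genes are one bias per output neuron.
    biases = genome[n_outputs * n_inputs:]
    return weights, biases

def controller_step(genome, sensor_readings, n_outputs):
    """Map sensor readings to motor commands in [-1, 1] (hypothetical controller)."""
    weights, biases = decode_genome(genome, len(sensor_readings), n_outputs)
    return [
        math.tanh(sum(w * s for w, s in zip(row, sensor_readings)) + b)
        for row, b in zip(weights, biases)
    ]

# Two sensors and two motors require 2 * (2 + 1) = 6 genes.
genome = [0.5, -0.5, 1.0, 1.0, 0.0, 0.1]
motors = controller_step(genome, [1.0, 1.0], n_outputs=2)
```

In this layout the evolutionary algorithm only ever manipulates the flat list, while the decoded network is the phenotype that is evaluated on the robot.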

Evolutionary robotics techniques can be applied either offline, in simulation, or online, that is, on the physical robots while they operate in the task environment. In offline evolution, controllers are evolved in simulation for a certain number of generations or until a certain performance criterion is met, and then deployed on real robots. In online evolution, on the other hand, an evolutionary algorithm is executed on the robots themselves as they perform their tasks. Evolution is most commonly applied offline for a number of practical and strategic reasons. Firstly, offline evolution is typically less time consuming than online evolution, although suitable simulation environments have to be developed beforehand. Secondly, offline evolution allows the researcher to concentrate on developing the body plan or control method without having to address issues inherently associated with physical robots, such as wear and tear, potential damage, calibration drift, and so on.
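The offline procedure described above can be sketched as a simple generational loop that stops after a fixed number of generations or once a performance criterion is met. This is a minimal illustration under stated assumptions: truncation selection, Gaussian mutation, and the placeholder `simulated_fitness` function are choices made for the example, and a real experiment would replace the placeholder with a robot simulation.

```python
import random

def evolve_offline(evaluate, genome_length, pop_size=20, max_generations=100,
                   target_fitness=None, sigma=0.1, seed=0):
    """Evolve genomes until max_generations or target_fitness is reached."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(genome_length)]
                  for _ in range(pop_size)]
    best_genome, best_fitness = None, float("-inf")
    for _ in range(max_generations):
        # Evaluate every candidate solution and rank by fitness (higher is better).
        scored = sorted(((evaluate(g), g) for g in population),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_fitness:
            best_fitness, best_genome = scored[0]
        if target_fitness is not None and best_fitness >= target_fitness:
            break  # performance criterion met
        # Truncation selection: the top half survives unchanged.
        parents = [g for _, g in scored[:pop_size // 2]]
        # The rest of the population is filled with mutated copies of parents.
        offspring = [[gene + rng.gauss(0, sigma) for gene in rng.choice(parents)]
                     for _ in range(pop_size - len(parents))]
        population = parents + offspring
    return best_genome, best_fitness

def simulated_fitness(genome):
    """Placeholder for a simulation run: best possible score is 0."""
    return -sum(g * g for g in genome)

best_genome, best_fitness = evolve_offline(simulated_fitness, genome_length=4,
                                           target_fitness=-0.01)
```

The genome returned by such a loop is what would then be deployed on the real robot.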

Guiding the search in evolutionary robotics

The experimenter applying evolutionary robotics techniques often relies on a self-organisation process in which evaluation and optimisation are holistic, thereby eliminating the need for manual and detailed specification of the desired body plan or control system.

Fitness-based evolution

Similarly to other evolutionary methods, a traditional evolutionary robotics process requires only sparse feedback, which is given by a measure of overall performance, that is, a fitness score. The fitness function is therefore at the heart of multiple evolutionary robotics processes and rewards improvement towards a task-dependent objective, a metaphor for the pressure to adapt in nature. According to the three-dimensional fitness space framework proposed by Floreano and Urzelai (2000), a fitness function can be classified along the following three axes:

Functional or behavioural, depending on whether the fitness function measures the components involved in the generation of the behaviour, such as the rotational speed of a robot’s wheel in a navigation task, or the effects of the behaviour, such as the distance covered by a robot in a given amount of time.

External or internal, depending on whether the fitness computation is based on the measurement of variables that are only available to an external observer with access to precise information, such as the number of clusters formed by the robots in an aggregation task, or on information directly available to the robot through onboard sensor readings.

Explicit or implicit, depending on the quantity and nature of constraints explicitly imposed in the working principles of solutions. The higher the number of components, the more explicit the fitness function and the more constrained the behaviour is.

As discussed in Nolfi and Floreano (2000), a behavioural-implicit-internal approach may be better suited to designing adaptive robots, because it imposes fewer restrictions on the evolutionary process, which in turn can lead to more efficient and effective behaviour adaptation.
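The functional versus behavioural distinction can be made concrete with the two examples given above, wheel speed and distance covered. The sketch below is illustrative only: both functions and the logged data are hypothetical, and "distance covered" is interpreted here as path length.

```python
import math

def functional_fitness(wheel_speeds):
    """Functional: rewards a component that generates behaviour (mean wheel speed)."""
    return sum((abs(left) + abs(right)) / 2
               for left, right in wheel_speeds) / len(wheel_speeds)

def behavioural_fitness(path):
    """Behavioural: rewards the effect of behaviour (length of the path travelled)."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

# A robot spinning in place: wheels turn fast, but no ground is covered.
spin_speeds = [(1.0, -1.0)] * 3
spin_path = [(0.0, 0.0)] * 4

# A robot driving straight: wheels turn fast and ground is covered.
straight_speeds = [(1.0, 1.0)] * 3
straight_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
```

Both robots receive the same functional score, whereas only the second one is rewarded by the behavioural measure.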

Novelty-based evolution

A class of methods has recently emerged in evolutionary robotics in which evolution is driven by novelty or behavioural diversity instead of a typical fitness function (see Fig. 2). That is, candidate solutions are rewarded for displaying behaviours that differ from previously evolved behaviours, rather than according to a predefined performance objective. This open-ended search process was initially formalised in the novelty search algorithm (Lehman and Stanley, 2011), and triggered a significant body of work in evolutionary robotics, including new perspectives on how to potentially avoid premature convergence-related issues (Lehman and Stanley, 2011; Lehman et al., 2013; Mouret and Doncieux, 2012).

As an example, consider the evolution of a control system for a maze-navigating robot. The fitness function can be defined based on how close the robot gets to the goal, which intuitively describes the task to solve. However, mazes with obstacles that prevent a direct route may cause the fitness function to deceive evolution. If the candidate solution is instead characterised in behavioural terms, such as the robot’s final position in the maze (see Fig. 3), searching for novel behaviours has the potential to avoid the deception associated with the fitness functions in a number of tasks.
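For the maze example, a novelty score can be sketched as the mean distance from a candidate's final position to its k nearest neighbours among previously observed final positions, so that behaviours ending in sparse regions score higher. The function name, the value of k, and the sample coordinates are assumptions made for illustration.

```python
import math

def novelty_score(behaviour, others, k=3):
    """Mean distance to the k nearest behaviours; larger means more novel."""
    if not others:
        raise ValueError("need at least one other behaviour to compare against")
    distances = sorted(math.dist(behaviour, other) for other in others)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)

# Final positions reached by earlier candidate solutions (hypothetical data).
seen = [(1.0, 1.0), (1.2, 1.0), (1.0, 1.2), (1.1, 1.1)]

crowded = novelty_score((1.1, 1.0), seen)  # ends among earlier solutions
sparse = novelty_score((5.0, 4.0), seen)   # ends in an unexplored region
```

A selection scheme driven by this score favours the second candidate regardless of how close either one is to the goal.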

Figure 2: In traditional evolutionary algorithms, genomes of candidate solutions are translated into phenotypes whose fitness is assessed. In evolutionary robotics, the behaviour of the robot can also be characterised during task execution, which enables evolution to be guided by the search for novel or diverse behaviours instead of fitness.

Figure 3: A maze with obstacles that prevent a direct route from the robot to the goal. The robot has to navigate from the start position (marked with "Start") to the goal (indicated by "G"). Red dots are examples of final positions (behaviour characterisations) of candidate solutions. Solutions that cause a robot to end up in sparse regions of the maze (the behaviour space) are considered novel.

Combining novelty and fitness

In recent years, numerous approaches have been introduced to combine the respective advantages of fitness-based evolution and of novelty-based evolution, to obtain more effective optimisation procedures (Cuccu and Gomez, 2011; Lehman and Stanley, 2010; Mouret and Doncieux, 2009; Mouret and Doncieux, 2012).

Interestingly, recent contributions have introduced a new class of approaches called Quality Diversity algorithms (Lehman and Stanley, 2011b; Cully and Mouret, 2013; Pugh et al., 2015) or Illumination algorithms (Mouret and Clune, 2015); for a comprehensive review of Quality Diversity algorithms see Pugh et al. (2016). The goal of Quality Diversity algorithms is to discover a wide range of diverse candidate solutions, each of which is also optimised for performance, that is, with respect to a measure of quality. Quality Diversity algorithms originated with the novelty search with local competition algorithm (NSLC; Lehman and Stanley, 2011b), a multiobjective formulation of novelty and fitness in which the fitness objective is changed from a global measure to one relative to a local neighbourhood of behaviourally similar individuals. The working principles of NSLC have inspired different algorithms, one of the most widely adopted being the MAP-Elites algorithm (Mouret and Clune, 2015). Given a behaviour characterisation with N dimensions, MAP-Elites first transforms the behaviour space into discrete bins according to a user-defined granularity level, and then tries to find the highest-performing individual for each bin in the discretised space in order to construct a behaviour-performance map.
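The MAP-Elites principle described above can be sketched in a few lines: discretise the behaviour space into bins, and keep only the best-performing genome found so far in each bin. This is a simplified illustration rather than the published implementation; the parameter names, the mutation operator, and the `toy_evaluate` function (which uses the first two genes as the behaviour characterisation) are assumptions.

```python
import random

def to_bin(behaviour, bins_per_dim, low=0.0, high=1.0):
    """Map an N-dimensional behaviour in [low, high] to a tuple of bin indices."""
    cell = []
    for value in behaviour:
        index = int((value - low) / (high - low) * bins_per_dim)
        cell.append(min(max(index, 0), bins_per_dim - 1))
    return tuple(cell)

def map_elites(evaluate, genome_length, bins_per_dim=5, iterations=2000,
               n_initial=50, sigma=0.1, seed=0):
    """Return a behaviour-performance map: cell -> (fitness, genome)."""
    rng = random.Random(seed)
    archive = {}
    for i in range(iterations):
        if i < n_initial:
            # Bootstrap the map with random genomes.
            genome = [rng.random() for _ in range(genome_length)]
        else:
            # Mutate a randomly chosen elite, keeping genes in [0, 1].
            _, parent = rng.choice(list(archive.values()))
            genome = [min(max(g + rng.gauss(0, sigma), 0.0), 1.0) for g in parent]
        fitness, behaviour = evaluate(genome)
        cell = to_bin(behaviour, bins_per_dim)
        # Keep the newcomer only if its bin is empty or it beats the current elite.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, genome)
    return archive

def toy_evaluate(genome):
    """Hypothetical task: behaviour is the first two genes; fitness peaks at 0.5."""
    fitness = -sum((g - 0.5) ** 2 for g in genome)
    return fitness, genome[:2]

archive = map_elites(toy_evaluate, genome_length=4)
```

With a two-dimensional behaviour characterisation and five bins per dimension, the resulting map holds at most 25 elites, one per bin.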

Early pioneering contributions

Evolutionary algorithms have been the subject of significant progress since the introduction of the concept of evolutionary search and initial discussions on the potential of machine intelligence by Turing (1950). The possibility of evolving robots was later evoked by the neurophysiologist Valentino Braitenberg (Braitenberg, 1984) in his thought-experiment on the creation of new robot designs. Almost a decade later, the field of evolutionary robotics began to develop. Pioneering groups at the University of Southern California, US (Lewis et al., 1992), at the Swiss Federal Institute of Technology in Lausanne, Switzerland (Floreano and Mondada, 1994, 1996), at Sussex University in the UK (Harvey et al., 1994), and at the Italian National Research Council (Nolfi et al., 1994) laid the foundation for a number of important studies that followed.

Lewis et al. (1992) evolved neural network controllers for a real six-legged robot by synthesising the controllers on a workstation, and downloading each controller to the real robot for performance evaluation. Each evaluation required a human observer to monitor and score the performance of the real robot. Complementarily, Floreano and Mondada (1994, 1996) experimented with evolution of controllers directly in real robotic hardware, including navigation and homing behaviours for a Khepera robot. Given the low computational power of the Khepera robot, the actual evolutionary computation was performed on a workstation. The synthesis of successful controllers required up to ten days of continuous evolution. Harvey et al. (1994) evolved controllers for a real Gantry robot, and demonstrated principled approaches to the evolution of visually-guided robot behaviour (e.g. navigation and shape discrimination tasks), namely concurrent evolution of sensorimotor features and control systems. Finally, Nolfi et al. (1994) proposed evolving controllers in simulation and continuing evolution for a few generations in real hardware if a decrease in performance was observed when controllers were transferred.

The pioneering contributions facilitated progress and cross-fertilisation of ideas between different robotics domains. For example, Brooks (1992), a pioneer in behaviour-based robotics, acknowledged the potential of evolutionary robotics techniques for control synthesis, and argued for different research avenues that are still being pursued, such as making evolution aware of regularities in morphological structure (e.g. symmetric sensor placement) and enabling evolution to mirror them in the control structure (Silva et al., 2016). In summary, early pioneering contributions highlighted the potential of evolutionary robotics and gave rise to a number of different research directions, which we summarise below.

Main research directions

Gait evolution

From the early years of evolutionary robotics, wheeled robots have been widely used in the field. Evolutionary robotics has nevertheless enabled synthesis of control for robots with varying morphologies, such as legged robots (Gong et al., 2010). Legged robots have significant potential because they can access types of terrain unsuitable for wheeled robots. In this respect, evolved gaits have outperformed engineered gaits in different situations (Yosinski et al., 2011), and have even been included in commercial products such as the first version of Sony’s AIBO robot (Hornby et al., 2005).

Damage recovery

One of the main research topics in legged robots is the ability to recover from damage to one or more legs. Bongard et al. (2006) and Bongard (2009) introduced a damage recovery approach based on three evolutionary algorithms. The approach is implemented according to an onboard combination of simulation and evolution. The first algorithm optimises a population of physical simulators in order to more accurately model the real environment. The second algorithm then creates exploratory behaviours for the real robot to execute so as to collect new training data for the first algorithm. Finally, the third algorithm uses the best simulator to evolve locomotion behaviours for a real quadruped robot. Besides increasing the number of successful controllers, the combination of the three evolutionary algorithms yields an important advantage: enabling the robot to recover from unanticipated situations such as physical damage to one of its legs (Bongard et al., 2006). The working principles of Bongard’s approach have fostered the development of novel approaches such as the intelligent trial-and-error algorithm (Cully et al., 2015), in which a detailed map of high-performing behaviours is constructed in simulation via the Quality Diversity algorithm MAP-Elites (Mouret and Clune, 2015) and then deployed on the real robot. During task execution, if performance drops below a user-defined threshold due to, for instance, physical damage to the robot’s body or changes in the environmental conditions, the robot can iteratively select a promising behaviour from the map, test it, and measure its performance until a suitable behaviour is chosen.
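The select, test, and measure loop at the end of this description can be sketched as follows. This is a deliberately simplified illustration: it tries behaviours in descending order of their performance in simulation and does not revise those predictions after each trial, a step that the published algorithm performs. The function name, the map layout, the gait names, and the performance values are all hypothetical.

```python
def recover_behaviour(behaviour_map, test_on_robot, threshold, max_trials=10):
    """Try behaviours from the map until one performs at or above the threshold.

    behaviour_map maps a cell to (predicted_performance, controller).
    Returns (cell, measured_performance, number_of_trials) for the best trial.
    """
    if not behaviour_map:
        raise ValueError("behaviour map is empty")
    # Most promising behaviours (according to simulation) are tried first.
    ranked = sorted(behaviour_map.items(), key=lambda item: item[1][0], reverse=True)
    best_cell, best_measured, trials = None, float("-inf"), 0
    for cell, (_, controller) in ranked[:max_trials]:
        trials += 1
        measured = test_on_robot(controller)  # one trial on the physical robot
        if measured > best_measured:
            best_cell, best_measured = cell, measured
        if measured >= threshold:
            break  # a suitable behaviour has been found
    return best_cell, best_measured, trials

# Map built in simulation for an undamaged robot (hypothetical values).
behaviour_map = {
    (0, 0): (0.9, "tripod_gait"),
    (0, 1): (0.7, "limping_gait"),
    (1, 0): (0.4, "crawl"),
}
# Performance actually measured after damage to one leg (hypothetical values).
damaged_performance = {"tripod_gait": 0.1, "limping_gait": 0.6, "crawl": 0.3}

result = recover_behaviour(behaviour_map, damaged_performance.get, threshold=0.5)
```

In this toy scenario the gait that was best in simulation fails on the damaged robot, and the second trial yields a behaviour above the threshold.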

Control for multirobot systems

Control for robot collectives is typically challenging to design by hand because there is no general approach to derive the behaviour for individual robots based on a desired global behaviour or task description. In this respect, evolutionary robotics techniques have also been applied to evolve decentralised control for robot collectives. Quinn et al. (2003) were among the first to demonstrate the potential of evolutionary robotics, as they successfully evolved coordinated, cooperative behaviours for multirobot systems. A group of three robots was evolved to perform a formation-movement task without losing contact with each other, equipped only with minimal infrared sensors. After an initial coordination phase, different roles emerged depending on the relative position of robots and their history of interactions. Shortly after, Nelson et al. (2004) presented another notable example of collective behaviour evolution by having teams of real mobile robots playing a robotic version of the game Capture the Flag. Each team defended its own goal while trying to 'attack' the opposing team’s goal. Robot controllers relied entirely on processed video data for sensing the environment. More recently, an approach called multiagent HyperNEAT (D’Ambrosio et al., 2010; D’Ambrosio et al., 2011; D’Ambrosio and Stanley, 2013) has made it possible to represent controllers for groups of robots as a function of: (i) the control policy geometry, that is, the relationship between the role of the robots and their position in the group, which allows the group size to be changed dynamically without further evolution, and (ii) the situational policy geometry, which enables each robot to have multiple control policies and switch between them depending on the robot’s state. However, multiagent HyperNEAT requires the number of policies to be specified by the experimenter, and assumes that there is a geometric relationship between different policies (e.g. advancing and retreating are geometric opposites). A recent variation of HyperNEAT called multibrain HyperNEAT (Schrum et al., 2016) has been introduced as a potential solution to evolving multiple control policies without assuming geometric relationships between them.

Throughout the years, evolutionary robotics techniques have been applied to a number of different tasks such as hole avoidance (Trianni et al., 2006), collective transport of objects (Groß and Dorigo, 2009), self-assembly (Ampatzis et al., 2009), coordinated motion (Sperati et al., 2008), and chain formation (Sperati et al., 2011). However, such studies were typically carried out either in simulation or in highly controlled environments such as small and enclosed arenas in laboratories. In a demonstration of evolutionary robots operating in a real and uncontrolled environment, Duarte et al. (2016) evolved control for a swarm of aquatic surface robots to execute common collective tasks, namely homing, clustering, dispersion, and area monitoring, and then composed the controllers for each task to carry out a complete environmental monitoring task.

Body plan evolution

The first study on coevolution of body plans and control systems was carried out by Lipson and Pollack (2000) who, inspired by the experiments of Sims (1994) in the artificial life domain, developed an approach in which both body plans and control for robots were fully optimised in simulation (see Fig. 1). The structure of the fittest robot was produced using additive manufacturing techniques. Stepper motors and microcontrollers were then manually attached to the physical structure and the performance of the robot was assessed. With the advent of new materials and fabrication techniques, less conventional approaches to designing robots are emerging. A recent example is soft robotics, in which robots are composed of soft and hard materials. Among the first contributors, Hiller and Lipson (2012) showcased evolutionary design and fabrication of freeform soft robots capable of forward locomotion using soft volumetrically expanding actuator materials. Actuation was provided by the materials periodically varying in volume.

The role of evolutionary robotics in other fields of research

In addition to the use of evolutionary robotics techniques for engineering purposes, evolved robots can also provide insights as to how and why specific traits evolved in natural systems (Floreano and Keller, 2010). Examples include studies on the evolution of communication and signalling (Floreano et al., 2007; Mitri et al., 2010), deception and information suppression in foraging robots with conflicting interests (Mitri et al., 2009), evolvability (Lehman and Stanley, 2013; Wilder and Stanley, 2015), polymorphic mating strategies (Elfwing and Doya, 2014), and evolution of complexity, that is, whether, when, how, and why increased complexity evolved in biological populations (Auerbach and Bongard, 2014).

Open issues in evolutionary robot engineering

Arguably, the main axis of research in evolutionary robotics is the engineering of control systems. In this respect, researchers have been consistently faced with a number of issues (Silva et al., 2016), namely:

The reality gap (Jakobi, 1997), which manifests itself when controllers evolved in simulation prove ineffective on the physical robots. Potential solutions range from less formalised approaches, such as using samples from the real robots’ sensors in simulation (Miglino et al., 1995), to more formalised approaches such as the transferability approach (Koos et al., 2013), in which the goal is to learn the discrepancies between simulation and reality in order to steer evolution away from behaviours that do not cross the reality gap.

The prohibitively long time necessary to evolve controllers directly on real robots (Matarić and Cliff, 1996). One way to eliminate the reality gap is to rely exclusively on real robots for controller evolution, which is extremely time-consuming at the current state of development. Potential solutions include: (i) embodied evolution (Watson et al., 2002), in which the evolutionary algorithm is distributed across a group of robots that evolve in parallel and exchange genetic information, (ii) seeding the evolutionary process with pre-evolved or pre-programmed partial or approximate solutions (Silva et al., 2014a), and (iii) the onboard combination of simulation-based evolution and online evolution (Bongard and Lipson, 2004, 2005; De Nardi and Holland, 2008; Bongard et al., 2006; Bongard, 2009; O’Dowd et al., 2011), in which each robot maintains models of the environment and of other robots, and the models are adapted based on differences observed in controller performance between the onboard simulation and reality. However, the performance benefits of such approaches are dependent on encounters between robots, which may be infrequent in large or open environments, on the size of the collective, and on the communication capabilities of the robots.

The bootstrap problem (Nelson et al., 2009) and deception (Whitley, 1991) are issues inherent to the evolutionary approach that drive evolution towards local optima. One solution is to directly assist the evolutionary process, which includes (i) incremental evolution (Mouret and Doncieux, 2008; Christensen and Dorigo, 2006), in which a task is decomposed into different components in a top-down fashion, (ii) behavioural decomposition, in which the robot controller is divided into sub-controllers that are generated separately to solve a different sub-task, and then composed via a second evolutionary process (Moioli et al., 2008; Duarte et al., 2015), and (iii) semi-interactive human in-the-loop approaches (Celis et al., 2013; Woolley and Stanley, 2014). However, all of these approaches require a large amount of human knowledge. Another potential solution is to direct the evolutionary process towards increasing exploration or exploitation of the search space by importing general techniques from evolutionary computation such as multiobjective algorithms. A different alternative is to exploit design for emergence techniques in which behaviour is considered a multi-layer system with different levels of organisation unfolding over different time scales (Nolfi, 2005, 2011; Yamashita and Tani, 2008). In such a system, short-term interactions between a robot and the environment give rise to low-level behaviours, the interaction between lower-level behaviours gives rise to higher-level behaviours, and higher-level behaviours cause changes to the lower-level behaviours and/or the interaction between the constituent elements (control system, body, and environment).

The design of genomic encodings and of the genotype-phenotype mappings that enable the evolution of complex structures (Meyer et al., 1998). The vast majority of evolutionary robotics studies employ direct encoding (Nelson et al., 2009), in which genotypes directly specify a phenotype: each parameter is encoded and optimised separately, which leads to scalability issues. Indirect encodings, on the other hand, allow solutions to be represented as patterns of parameters, rather than requiring each parameter to be represented individually (Bentley and Kumar, 1999; Bongard, 2002; Risi, 2012; Seys and Beer, 2007; Stanley and Miikkulainen, 2003; Stanley, 2007; Stanley et al., 2009; D’Ambrosio et al., 2014; Clune et al., 2011). However, indirect encodings are usually biased towards regular structures (e.g. symmetry), which makes it difficult for them to properly account for irregularities such as faults in the joints of four-legged robots (Clune et al., 2011). One solution is to combine indirect encodings with a refining process such as direct encodings by, for instance: (i) evolving with an indirect encoding and then switching to a direct encoding after a fixed, predefined number of generations (Clune et al., 2011), or (ii) evolving genomes composed of an indirect encoding part and a direct encoding part, and allowing evolution to automatically explore multiple encoding combinations (Silva et al., 2015).
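The difference between the encodings can be illustrated with a toy example. In the sketch below, an indirect encoding generates all parameters from three genes as a regular pattern, and a hybrid genome adds one directly encoded offset per parameter so that a single irregularity can be expressed. The sinusoidal pattern generator and all names are assumptions made for illustration; this is loosely in the spirit of option (ii) and is not the algorithm of any cited study.

```python
import math

def indirect_decode(amplitude, frequency, phase, n_params):
    """Indirect encoding: three genes generate a regular pattern of n_params values."""
    return [amplitude * math.sin(frequency * i + phase) for i in range(n_params)]

def hybrid_decode(pattern_genes, direct_offsets):
    """Hybrid encoding: a regular pattern refined by one direct gene per parameter."""
    base = indirect_decode(*pattern_genes, n_params=len(direct_offsets))
    return [b + d for b, d in zip(base, direct_offsets)]

pattern_genes = (1.0, 0.5, 0.0)  # amplitude, frequency, phase

# Purely regular phenotype: every parameter follows the pattern.
regular = indirect_decode(*pattern_genes, n_params=4)

# One directly encoded offset compensates for an irregularity in parameter 2,
# for example a faulty joint, without disturbing the other parameters.
adjusted = hybrid_decode(pattern_genes, [0.0, 0.0, -0.3, 0.0])
```

The indirect part scales to any number of parameters with a fixed genome size, while the direct part grows with the number of parameters, which is the trade-off discussed above.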

The absence of standard research practices in the field. For example, whereas there is an almost unanimous use of computer simulations in evolutionary robotics, there is no prevalent simulation platform, which makes it difficult to reproduce results and to carry out comparative studies. Evolutionary robotics also suffers from the lack of benchmarks and testbeds. Even though there are multiple “common” tasks, there is no standard implementation of these tasks, meaning that it is currently not possible for researchers to assess an algorithm on a set of task instances. Such instances would be valuable for proofs of concept showing that a given algorithm has enough potential to be further explored, and for studies that analyse the strengths and limitations of a technique on a large number of different tasks.

Acknowledgements

The authors thank Kenneth Stanley and Stefano Nolfi for their constructive feedback and valuable comments.

References

Bentley, P. and Kumar, S. (1999). Three ways to grow designs: A comparison of evolved embryogenies for a design problem. In Proceedings of the 1st Genetic and Evolutionary Computation Conference, pages 35-43. ACM Press, New York, NY.

Lewis, M. A., Fagg, A. H., and Solidum, A. (1992). Genetic programming approach to the construction of a neural network for control of a walking robot. In Proceedings of the IEEE International Conference on Robotics and Automation, pages 2618-2623. IEEE Press, Piscataway, NJ.

Mitri, S., Floreano, D., and Keller, L. (2009). The evolution of information suppression in communicating robots with conflicting interests. Proceedings of the National Academy of Sciences, 106(37):15786-15790.

Nolfi, S., Floreano, D., Miglino, O., and Mondada, F. (1994). How to evolve autonomous robots: Different approaches in evolutionary robotics. In Proceedings of the 4th International Workshop on Synthesis and Simulation of Living Systems, pages 190-197. MIT Press, Cambridge, MA.

Seys, C. W. and Beer, R. D. (2007). Genotype reuse more important than genotype size in evolvability of embodied neural networks. In Proceedings of the 9th European Conference on Artificial Life, pages 915-924. Springer, Berlin, Germany.

Silva, F., Correia, L., and Christensen, A. L. (2014). Speeding up online evolution of robotic controllers with macro-neurons. In Proceedings of the 17th European Conference on the Applications of Evolutionary Computation, pages 765-776. Springer, Berlin, Germany.

Silva, F., Correia, L., and Christensen, A. L. (2015). R-HybrID: Evolution of agent controllers with a hybridisation of indirect and direct encodings. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, pages 735-744. IFAAMAS, Richland, SC.

Sims, K. (1994). Evolving 3D morphology and behavior by competition. In Proceedings of the 4th International Conference on Simulation and Synthesis of Living Systems, pages 28-39. MIT Press, Cambridge, MA.