Guided self-organization: perception–action loops of embodied systems

In general, self-organization is defined as the transition of a system into an organized form in the absence of external or centralized control. Thus, one may emphasize two key features of a self-organized system or process: (i) an increase in organization (structure and/or functionality) over time, and (ii) local interactions that are not guided by any external agent. At first glance, the second feature (the absence of external guidance) casts serious doubt on the very idea of Guided Self-Organization (GSO): the term sounds almost self-contradictory. However, there is a simple resolution of the apparent inconsistency.

Let us first illustrate this with an example provided by studies of optimal path formation within artificial ant colonies. Optimal paths (sometimes forming a network such as a minimal spanning tree; Prokopenko et al. 2005) connecting the nest and some distributed food sources are a well-known outcome of self-organization that involves pheromone-depositing ants. Each individual ant uses only local information, without reference to the global network, and the latter self-organizes after multiple stigmergic interactions between the ants and the environment. The first feature of self-organization (the increase in organization over time) is manifested by such an optimal network of paths. The second feature, which demands that the local interactions are not guided by any centralized control or external agent, is also evident: every ant acts independently and locally, without tracing some predesigned blueprint.

At this stage, we extend this example by using the results reported by Van Vorhis Key and Baker (1982), who studied odor-conditioned anemotaxis exhibited by Argentine ant workers, Iridomyrmex humilis. They experimented with a specific trail pheromone component that was presented in two ways: as a wide, relatively uniform swath of permeated air, and as a point source creating a time-averaged plume downwind. One of the main observations was that ants traveled significantly farther toward the pheromone source in wind than without wind. That is, the external pressure provided by the additional point source of the specific trail pheromone guided the ants in a particular way. Such guidance, however, was not provided as a control input to individual ants; that is, it involved no modification of the ants’ neural circuitry. One may therefore argue that the resulting optimal paths still appeared as an outcome of self-organization driven by pheromone-depositing ants, and not by any specific blueprint, while at the same time the paths were affected (guided) toward a specific goal or task. One may imagine, for instance, that deploying other point sources of the trail pheromone component may guide the resulting optimal paths in various ways.

Crucially, the external pressure provided by the additional point source of the trail pheromone was not applied via some explicit change to the control logic of the local agents. In the second example, the paths still formed optimally subject to the additional constraint within the environment, while the inner workings of the local agents (ants) stayed the same: the extra pheromone merely affected some of the local decisions by changing the environment. This illustration suggests that in order to consistently define GSO we may need to elaborate on the second feature of self-organization as follows: (ii) the local interactions are not explicitly guided by any external agent. In this context, by an explicit effect we mean a change within the agents’ decision-making mechanism, while an implicit change assumes a modification to the environment. Such an elaboration would not violate the spirit of many adopted definitions of self-organization (Prokopenko et al. 2009), relegating all implicit guidance sources to the level of additional, task-dependent, system constraints.
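The distinction between explicit and implicit guidance can be illustrated with a toy model. The sketch below is not taken from any of the cited studies: it assumes a hypothetical two-branch choice in which each simulated ant picks a branch with probability proportional to (k + pheromone)^n, and the names choose_branch and run_colony, as well as all parameter values, are illustrative. The decision rule is identical in both runs; the only difference is pheromone placed in the environment beforehand.

```python
import random

def choose_branch(pheromone, rng, k=5.0, n=2.0):
    """Local decision rule of one ant; it is the same in every scenario."""
    weights = [(k + p) ** n for p in pheromone]
    return rng.choices(range(len(pheromone)), weights=weights)[0]

def run_colony(initial_pheromone, n_ants=2000, deposit=1.0, seed=0):
    """Send ants one by one; each deposits pheromone on the branch it takes."""
    rng = random.Random(seed)
    pheromone = list(initial_pheromone)  # the environment, not the agent
    counts = [0] * len(pheromone)
    for _ in range(n_ants):
        branch = choose_branch(pheromone, rng)
        pheromone[branch] += deposit     # stigmergic feedback
        counts[branch] += 1
    return counts

# Unguided: both branches start empty, so random fluctuations pick the winner.
unguided = run_colony([0.0, 0.0])
# Implicitly guided: extra pheromone is placed on branch 1 (an assumed value);
# choose_branch itself is untouched.
guided = run_colony([0.0, 50.0])
print("unguided traffic per branch:", unguided)
print("guided traffic per branch:  ", guided)
```

In this sketch the guidance enters only through the initial_pheromone argument, which plays the role of the task-dependent constraint, while positive feedback through deposits still produces the collective choice.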

Finally, to make GSO more specific one may need to add another feature: (iii) task-independent objectives are combined with task-dependent constraints. For instance, the actual path formation is understood in this context as an example of a task-independent objective (which is ultimately relevant to the global ant colony survival), while additional point-sources of the trail pheromone component may correspond to various tasks such as development of specific network topologies.

The interpretation suggested above provides some means to consistently incorporate the numerous ways in which guidance may be given to a self-organizing system or process, by treating these ways as additional constraints imposed on the system under consideration. The program of The Second International Workshop on Guided Self-Organization (GSO-2009) included 19 presentations in which GSO aspects were presented and investigated. In particular, the workshop put an emphasis on principles based on information flows through the perception–action loops of embodied systems, relating GSO to the notion that cognition and action emerge from interactions between brain, body, and environment while optimizing task-independent objective functions (Ay et al. 2008; Polani et al. 2007). The contributions to this special issue are grouped into three clusters corresponding to (i) guided self-organization in robotic systems (Martius and Herrmann 2011), (ii) information-theoretic studies of the sensorimotor loop (Ay et al. 2011; Capdepuy et al. 2011; Still and Precup 2011), and (iii) self-organization in information processing networks (Boedecker et al. 2011; Gershenson 2011; Lizier et al. 2011).

Martius and Herrmann (2011) postulate that autonomous robots can generate exploratory behavior by self-organization of the sensorimotor loop, and show that the behavioral manifold determined in this way can be modified in a goal-dependent way without reducing the self-induced activity of the robot. Three strategies for guided self-organization are then analyzed and evaluated for two different robots in a physically realistic simulation, using: (a) external rewards, (b) a problem-specific error function, and (c) assumptions about the symmetries of the desired behavior.

Still and Precup (2011) continue the theme of maximizing predictive power in information-theoretic terms, applying the idea to the problem of exploration in reinforcement learning and curiosity-driven learning. They propose that, in addition to maximizing the expected return, a learner should choose a policy that maximizes the predictive power of its own behavior, measured by the information that the most recent state–action pair carries about the future. The proposed optimization principle suggests that exploration emerges as a directed behavior that optimizes information gain, rather than being modeled solely as behavior randomization.

Capdepuy et al. (2011) propose a formalism for studying embodied cognition among multiple agents that allows one to (i) identify information flows and their limits under various scenarios and constraints, and (ii) use informational quantities in order to induce the self-organization of the agents’ behavior without any externally specified drives. The central question investigated in this article is the impact of coordination between agents, and it is shown that, under some conditions, self-organizing systems based on information-theoretic quantities have a tendency to spontaneously generate coordinated behavior. Moreover, the information-theoretic limits on what agents can achieve with and without coordination put specific constraints on the mechanisms underlying self-organization in the system.

Ay et al. (2011) further explore the idea that living beings are information processing systems and that the optimization of these processes should provide an evolutionary advantage, and study the use of the predictive information (PI) of the sensorimotor process. This measure is applied utilizing the dynamical systems approach to robot control. The study derives exact results for the PI together with explicit learning rules for the parameters of the controller. Interestingly, these learning rules are of Hebbian nature and local in the sense that the synaptic update is given by the product of activities available directly at the pertinent synaptic ports. Overall, this study shows that the learning rules derived from the maximum PI principle are a versatile tool for the self-organization of behavior in complex robotic systems.
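For readers unfamiliar with the quantity, predictive information is commonly understood as the mutual information between the past and the future of a process. The sketch below computes its one-step form, I(S_t; S_{t+1}), for a small discrete Markov chain. This is only an illustration under stated assumptions: Ay et al. work with the sensorimotor process of a robot controller rather than a finite chain, and the function name and the example transition matrices are hypothetical.

```python
import math

def one_step_predictive_information(stationary, transition):
    """Return I(S_t; S_{t+1}) in bits for a stationary finite Markov chain.

    stationary[i]    : probability of state i at time t
    transition[i][j] : probability of moving from state i to state j
    """
    n = len(stationary)
    # Marginal distribution of the next state.
    nxt = [sum(stationary[i] * transition[i][j] for i in range(n))
           for j in range(n)]
    pi = 0.0
    for i in range(n):
        for j in range(n):
            joint = stationary[i] * transition[i][j]
            if joint > 0.0:  # terms with zero probability contribute nothing
                pi += joint * math.log2(joint / (stationary[i] * nxt[j]))
    return pi

# A memoryless coin: the present says nothing about the next state.
coin = one_step_predictive_information([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]])
# A strictly alternating process: the present fully determines the next state.
flip = one_step_predictive_information([0.5, 0.5], [[0.0, 1.0], [1.0, 0.0]])
print(coin, flip)
```

The two examples mark the extremes for a binary process with a uniform stationary distribution: no predictability at all, and one full bit carried from each step to the next.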

Gershenson (2011) investigates random Boolean networks (RBNs) as self-organizing systems and reviews seven different methods for guiding the self-organization of RBNs. In particular, the work is focused on guiding RBNs toward the critical dynamical regime, which is near the phase transition between the ordered and chaotic phases. The properties and advantages of the critical regime for life, computation, adaptability, evolvability, and robustness are reviewed. Gershenson argues that the guidance methods of RBNs can be used for engineering systems with the features of the critical regime, as well as for studying how natural selection evolved living systems, which are also critical, resonating with the work of Boedecker et al. (2011).
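To make the object of study concrete, the following is a minimal sketch of a classical synchronous RBN, together with a simple damage-spreading measure that is often used to tell ordered from chaotic dynamics. It is not a reproduction of any method reviewed by Gershenson: the function names, network sizes, and the choice of connectivity k and lookup-table bias as tunable parameters are assumptions made here for illustration. In the standard theory of unbiased RBNs, k = 2 is usually cited as the critical connectivity.

```python
import random

def make_rbn(n_nodes, k, seed=0, bias=0.5):
    """Build a random Boolean network: k random inputs and one random
    lookup table per node. `bias` is the probability of a 1 in a table."""
    rng = random.Random(seed)
    inputs = [[rng.randrange(n_nodes) for _ in range(k)]
              for _ in range(n_nodes)]
    tables = [[1 if rng.random() < bias else 0 for _ in range(2 ** k)]
              for _ in range(n_nodes)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronously update every node from the current state."""
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for src in node_inputs:
            index = (index << 1) | state[src]  # pack input bits into an index
        new_state.append(table[index])
    return new_state

def damage_spread(n_nodes, k, steps=50, seed=0):
    """Normalized Hamming distance, after `steps` updates, between two
    trajectories whose initial states differ in a single bit."""
    inputs, tables = make_rbn(n_nodes, k, seed)
    rng = random.Random(seed + 1)
    a = [rng.randint(0, 1) for _ in range(n_nodes)]
    b = list(a)
    b[0] ^= 1  # flip one node
    for _ in range(steps):
        a = step(a, inputs, tables)
        b = step(b, inputs, tables)
    return sum(x != y for x, y in zip(a, b)) / n_nodes

for k in (1, 2, 4):
    print(k, damage_spread(200, k))
```

A single run with one seed is noisy, so any comparison across values of k should average over many networks and initial states before drawing conclusions about the dynamical regime.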

Lizier et al. (2011) also turn their attention to the criticality of complex systems and distinguish complex computation via its coherent structure in the corresponding local information dynamics. This leads to the observation that complex systems exhibit very highly structured coherent computation in comparison to: (a) ordered systems, which exhibit coherence but minimal structure in a computation dominated by information storage or non-interacting transfer structures; and (b) chaotic systems, whose computations are dominated by rampant information transfer eroding any coherence. The presented rigorous methodology identifies both clear and hidden coherent structure in complex computation, and suggests that coherent information structure may be a useful intrinsic goal in the domain of guided self-organization.

Boedecker et al. (2011) investigate information processing in randomly connected recurrent neural networks. It has been shown previously that the computational capabilities of these networks are maximized when the recurrent layer is close to the border between a stable and an unstable dynamics regime, the so-called edge of chaos. The adopted information-theoretical framework allows the authors to find evidence that both information transfer and storage in the recurrent layer are maximized close to the edge of chaos, explaining why guiding the recurrent layer toward the edge of chaos is computationally useful. As a consequence, this study suggests self-organized ways to improve performance in recurrent neural networks, and sheds some light on the reasons why biological systems are tuned to this specific regime.

The selection of contributions to this special issue highlights an emerging trend in studies of embodied systems motivated by GSO: identification and application of optimization principles that quantify specific information flows through the perception–action loops. These papers illustrate several important innovative concepts, further advancing studies of guided self-organization. Overall, the reported results strengthen the notion that cognition and action may self-organize from interactions between brain, body, and environment under appropriate constraints guiding the process, and open several avenues for future research.

Acknowledgments

The authors would like to thank all the reviewers for this special issue for their timely responses and useful comments. The authors also appreciate the professional effort of all authors who have contributed to the special issue. Last but not least, the authors are grateful for the support provided by the organizers of GSO-2009 at the Max Planck Institute for Mathematics in the Sciences, Leipzig.