Abstract

The sequencing of the human genome raises two intriguing questions: why has the prediction of the inheritance of common diseases from the presence of abnormal alleles proved so unrewarding in most cases and how can some 25 000 genes generate such a rich complexity evident in the human phenotype? It is proposed that light can be shed on these questions by viewing evolution and organisms as natural processes contingent on the second law of thermodynamics, equivalent to the principle of least action in its original form. Consequently, natural selection acts on variation in any mechanism that consumes energy from the environment rather than on genetic variation. According to this tenet cellular phenotype, represented by a minimum free energy attractor state comprising active gene products, has a causal role in giving rise, by a self-similar process of cell-to-cell interaction, to morphology and functionality in organisms, which, in turn, by a self-similar process entailing Darwin's proportional numbers are influencing their ecosystems. Thus, genes are merely a means of specifying polypeptides: those that serve free energy consumption in a given surroundings contribute to cellular phenotype as determined by the phenotype. In such natural processes, everything depends on everything else, and phenotypes are emergent properties of their systems.

1. Introduction

The sequencing of the human genome, completed in 2001, implied that abnormal alleles could be associated with common disease traits. In the event, 12 years on, this promise has not been fulfilled. The misnamed missing heritability has triggered the search for the ‘hidden’ allele quality that would be responsible for these traits, whereas genome-wide association studies (GWAS) have uncovered ‘hundreds of common variants whose allele frequencies are statistically correlated with various illnesses and traits…. the vast majority …. have no established biological relevance to disease or clinical utility for prognosis or treatment’ and so now whole genome sequencing is held out as the answer [1, p. 213]. However, studies of identical twin pairs, which allow outcomes from two identical genome sequences to be compared, show that, for the majority of common diseases, knowing the causes of death or disease history of one twin gives only marginal guidance as to the causes of death or disease suffered by the other [2]. Against this uncertain background, plans are being put in place to sequence the whole genomes of large numbers of individuals (tens of millions) in pursuit of personalized medicine (see https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/213705/dh_132382.pdf).

More fundamentally, biology has no substantive answer to the question ‘from where does the phenotypic complexity of higher mammals derive?’ What might be called the ‘genomic input’ for human cells, in terms of numbers of gene coding sequences, splicing potential, diversity in peptide folding and measured interactome size, can be stretched to a factor of a few thousand compared with bacterial cells, but the output, in terms of phenotypic function, is vastly greater. This raises the question: ‘are these components of the input really the most relevant ones?’ Take, for example, the round worm, Caenorhabditis elegans, with about 20 000 gene coding sequences and, on average, about five exons per sequence, compared with the human cell, with possibly 25 000 gene coding sequences and about eight exons per coding sequence, i.e. almost the same, but the human cell giving rise to organisms of vastly greater phenotypic complexity than roundworms. Of course, there are differences: the human genome is 30 times longer than that of C. elegans, but are there sound theoretical reasons for supposing that this makes the grand difference?

It is proposed here that by basing biology on the physics of dissipative systems governed by the universal law of nature, i.e. the second law of thermodynamics (hereinafter the second law), insights into the origin of phenotypic complexity can be derived, which, in turn, shed light on the ‘missing heritability’ issue. Moreover, the viewpoint allows us to evaluate the prospects for personalized medicine based on whole genome sequencing. We emphasize that no new science is involved in what follows; simply the consequences of the basic physics are followed to their logical conclusions, irrespective of whether the contemporary conceptual basis underpinning biology would support them. Today, the question ‘how biology works’ is considered a ‘hard problem’ scientifically because it appears very complicated, messy and diverse in a molecular (proteins and ribonucleic acids) size range, the mesosphere, which has been relatively little studied [3]. Instead, practical progress in manipulating the process of life made over the past half century is largely attributable to recombinant DNA technology. Therefore, molecular biological and genetic research is contingent on the assumptions inherent in the gene-focused modern synthesis, or neo-Darwinism, but important questions, including what the nature of life is [4] and how it originated [5], remain unresolved. Instead, the priority is to handle the vast quantities of data being produced by genome sequencing and to relate the data to perceived traits ranging across the board from physical characteristics, through pathological illnesses, to cultural, social and even political, preferences. However, these implicitly deterministic assumptions and undertakings have been questioned [6] and challenged at a foundational level [7,8].

2. The proposal

The predominant metaphor for the biological cell is the man-made machine. It justifies a materialist and reductionist/constructionist approach to biology [9]. While reductionism to identify fundamental factors and causal relationships is legitimate, the constructionist phase is problematic because of the phenomenon of emergence. Because symmetry-breaking1 is involved in growth, differentiation, proliferation, etc., in cells and in the development of organisms, ‘in general the relationship between a system and its parts is intellectually a one-way street’ [10, p. 396]. This is because new qualitative properties, which are by no means easy, and usually impossible, to predict, emerge when symmetry is broken. Yet, the missing ingredient is the obvious, but often overlooked, photon absorption from the surroundings that invariably leads to the breaking of a system's symmetry [11]. The incorporation of a single photon may open up entirely new paths for free energy consumption, e.g. as chemical bonds that constitute new compounds which provide new means for the organism to surpass its rivals.

It is self-evident that cells are thermodynamically open to acquire or expel energy, in the form of information and matter, from and to their environments. Specifically, living systems dissipate energy inwardly, because they exist in environments that are richer in energy than they are. These dissipative features, the metabolic pathways along which energy is conducted, are termed actions in physics. The energy transduction along the paths is governed by the fundamental principle of least action for open systems as proposed by De Maupertuis. However, his non-deterministic resolution was subsequently misunderstood and reduced by his successors to a deterministic equation which applies only to stationary systems [11]. In simple terms, this principle of Maupertuis requires that any energy gradient will be levelled as efficiently as the prevailing conditions allow. Thus, organisms in an energy (nutrient)-rich environment will transduce as much energy as conditions allow inwards as quickly as possible. This consumption of free energy equates with the second law of maximizing entropy [12]. In the case of an open and evolving system competing for free energy with other systems, the most probable state (maximum entropy) will be associated with maximum organization/complexity when this embodiment minimizes the action. Thus, selection will act on variation in free energy consumption, often manifesting itself as high metabolic efficiency, reproduction rate and motility, etc. [12]. Since Boltzmann's time, entropy has been erroneously associated with disorder; however, in thermodynamic terms, entropy is a measure of bound and free energy. In a growing organism, just as in an increasing population, entropy comprises both bound and free energy. Eventually, when the maximum entropy state has been attained, all energy is bound in the soma just as is the case when the maximum population density is attained [13].
In an ecosystem, bound energy associated with a species is regarded as a source of free energy, e.g. in the form of food for other species. In open systems, the influx of energy from the environment to the system (symmetry-breaking) is a cause of new properties, as is evident from the difference in properties between an un-reacted and a reacted mixture of chemicals [14]. For example, gaseous oxygen and hydrogen, when reacted (exothermically), produce liquid water, and gaseous hydrogen and nitrogen, when reacted (also exothermically), produce pungent ammonia, a gas at room temperature that is readily liquefied under moderate pressure.

On the basis of the equation for evolving systems, i.e. Maupertuis’ principle [12], it can be concluded that any natural process, from individual organisms through ecosystems, to global biota, can be seen to tend towards increasing complexity owing to natural selection favouring the least action, the most efficient free energy consumption [12]. Thus, rather than ends in themselves, organisms are seen to be pre-eminently the manifestation of the supreme law of physics.

The thermodynamic tenet proposed here should be regarded not only as an intriguing option, but as reasoning that is fully consistent with observations. Specifically, the evolutionary equation reproduces ubiquitous patterns, i.e. skewed distributions, logarithmic spirals, sigmoid curves, branching structures, scale-free networks and power laws that make no distinction between animate and inanimate objects. Therefore, we argue that the second law is demonstrably a secure foundation upon which to rebuild the science of living systems.

3. Relevant implications

The implications that follow from this insight into physics are profound for biology. The emphasis on the acquisition of free energy and, therefore, metabolism, as the driving force of evolution implies that the origin of life was ‘metabolism-first’, rather than ‘replication-first’ [15]. Pascal et al. [16] note that the origin of life must involve dissipation of energy. According to the second law, replication is not an objective in itself, but a means to consume more free energy more rapidly than, for instance, by simply growing bigger [17] (see below). A metabolism-first origin of life was first proposed by Oparin in 1926 and further developed by Dyson [18], although without explicitly identifying free energy as the driving force. More recently, abiogenesis has been treated within the framework of complex dissipative systems [19].

In evolutionary terms, regulation of energy transduction (metabolism) would be a clear strategy for minimizing actions within the cell and, given a ‘metabolism-first’ origin, such regulation would most likely be based on the components responsible for metabolism, namely the gene products: in modern cells, proteins, not genomic DNA. This implies an epigenetic2 mechanism for the regulation of the cell. A model for such a mechanism has been proposed for mammalian cells [20] (it applies to any eukaryotic cell) and compared with competing genetic regulatory mechanisms [7].

True replication, in contrast to the simple splitting of a cell into two, is favoured when it allows free energy to be consumed more effectively. Moreover, the natural selection of least time free energy consumption will favour any improvements in transduction efficiency that the mere replication might bring about. In other words, in abundant circumstances, evolution tends to progress to greater complexity/organization3 to find more effective means to consume free energy.

Epigenetic cell regulation is based on a minimum free energy attractor state4 (representing cellular phenotype) achieved through interactions, according to ‘rules of engagement’, between active gene products, i.e. proteins derived from peptides by folding and other post-translational processes such as phosphorylation [20]. It is useful to envisage the cell/system as comprising a state space with a dimension for each active gene product arbitrarily calibrated from zero to the maximum expressible activity. The attractor location (a profile of typically up to a few thousand gene products and their activities) in the state space represents the phenotype and the basin of attraction provides robustness to perturbation of the phenotype. Violation of the rules of engagement between gene products can cause an irreversible attractor/phenotype transition to a variant attractor and, thus, a variant phenotype. Such transitions are discontinuous, i.e. jumps from one thermodynamic steady state to another. The attractor of a cell from a stably replicating species is termed the ‘home’ attractor: it has been evolutionarily conditioned to optimize the integrity of replication [20]. The state space specified by the human genotype potentially contains a very large number of attractors to consume free energy under various circumstances [7].
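The attractor picture described above can be illustrated with a deliberately minimal numerical sketch. It is not the model of [20]: it assumes a single hypothetical gene product whose activity x is scaled from 0 to 1, and an invented double-well free energy F(x) with minima at 0.25 and 0.75 and a barrier at 0.5. The function names (free_energy, gradient, relax) and all numerical values are illustrative assumptions. The sketch shows only two qualitative features mentioned in the text: a small perturbation relaxes back to the same attractor (robustness within the basin), whereas a perturbation that crosses the barrier ends in a different attractor (a discontinuous transition).

```python
# Toy illustration only: one hypothetical gene product, activity x in [0, 1].
# The free energy below is an assumed double well, not taken from the paper.

def free_energy(x):
    """Assumed landscape with minima (attractors) at x = 0.25 and x = 0.75."""
    return (x - 0.25) ** 2 * (x - 0.75) ** 2

def gradient(x):
    """Derivative of free_energy with respect to x."""
    return 2.0 * (x - 0.25) * (x - 0.75) * (2.0 * x - 1.0)

def relax(x, step=0.5, n_steps=2000):
    """Follow the downhill direction of the free energy until the state settles."""
    for _ in range(n_steps):
        x -= step * gradient(x)
    return x

home = 0.75                      # call the upper minimum the 'home' attractor
after_small = relax(home - 0.2)  # perturbed state 0.55 stays inside the basin
after_large = relax(home - 0.3)  # perturbed state 0.45 has crossed the barrier

print(round(after_small, 6))  # settles back at the home attractor
print(round(after_large, 6))  # settles at the variant attractor
```

In a realistic setting the state space would have one dimension per active gene product (up to a few thousand, according to the text), and the landscape would be determined by the rules of engagement rather than by a closed-form expression.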

Although both the DNA and the attractor state are inherited at cell division and fusion [7], the attractor state is the more fundamental of the two in terms of the inheritance of phenotype. This is clear from the fact that in a generic sense there is no contiguous information flow from the genotype to the phenotype [21]: the, in principle non-determinate, peptide folding process [22] acts as an insurmountable barrier to the upward flow of information, i.e. from genotype to phenotype. While genomic sequence stipulates the peptide, subsequent processes under the regulation of the attractor yield the phenotype: in other words, the phenotype is its own cause, and causation acts downwards on the genome and its products [23]. In terms of physics, the evolution of a system affects the driving forces, which, in turn, affect the path of evolution. Therefore, the equation of evolution cannot be solved and hence deterministic causation remains an illusion. In fact, when everything depends on everything else, it is appropriate to speak about an energy transduction network that evolves as energy flows.

Both the peptide folding and the interaction processes constituting the attractor are dissipative. Therefore, the components of the system denoted as actions keep changing owing to the influx of quanta, and hence new characteristics will emerge [14] in the form of phenotypic properties. Thus, this description that accounts also for the invariable influx would predict that at least a proportion, those involving the interaction of two or more proteins, of phenotypic properties are emergent, i.e. they are not reducible to the properties of the products of individual gene coding sequences (genes) or, indeed, the proteins they give rise to. This does not deny that damage to sequences specifying peptides may lead to damaging phenotypic consequences: simply that the genome contains a proportion of coding sequences that cannot be related to specific phenotypic properties as is expected of a gene as it was originally perceived by Mendel. Moreover, it is worth emphasizing that there is no way of knowing a priori whether a particular mutation affecting the phenotype will, in some way, be beneficial or harmful for free energy transduction efficiency in a given environment, because the forces imposed by the surroundings cannot be experienced in the absence of energy conduction.

4. Discussion

The thermodynamic openness of organisms is universally accepted and has been widely discussed by, among others, [5,24–26], but, curiously, it was not emphasized by Darwin. Many of its implications for cellular processing have been, and still are, ignored by mainstream cell and molecular biology. Most striking is the assumption that maximum entropy implies maximum disorder as the most probable state. This is the result of Boltzmann's molecular interpretation of entropy more than 100 years ago. Boltzmann treated his molecular ensembles as systems closed to energy gain or loss with their surroundings. As nothing new can emerge in such stationary-state systems, only incoherence, namely disorder, will increase owing to exchange of quanta with incoherent surroundings [12]. Conversely, the system will become more coherent via exchange of quanta with coherent surroundings. Order and disorder are not ends in themselves when an open free energy-consuming system evolves towards its most probable state: however, complex and orderly machinery will be favoured over simplicity when it is a means of allowing more effective free energy consumption. This is empirically demonstrated by Bénard cells [27] as a fulfilment of the principle of least action. In this case, beyond a threshold of energy gradient within a column of liquid uniformly heated at the base, an ordered form of convection emerges to increase the efficiency of energy transduction through the column. After more than 100 years of belief that high entropy must mean disorder, it is conceptually difficult to accept that the highly organized structures of living organisms are manifestations of the quest for increasing entropy. Yet, it is worth recalling that even Boltzmann noted that animates struggle for entropy, not against it. A Darwin contemporary and naturalist, Blyth [28], however, did refer to the role of nutrient (energy) in promoting fitness.
He states ‘among animals which procure their food by means of their agility, strength, or delicacy of sense, the one best organised must always obtain the greatest quantity; and must, therefore, become physically the strongest, and be thus enabled, by routing its opponents, to transmit its superior qualities to a greater number of offspring’. This statement can be regarded as recognizing the role of the law of least action and could be, in fact, an abstract for the model presented here. Blyth's comments, although made several years before Darwin published the Origin of species, have been dismissed because they implied that natural selection was directed to stabilizing species and not to evolutionary change [29]. However, evolution can be regarded as a non-gradual process, as in, for example, the theory of punctuated equilibrium [30], which is consistent with the thermodynamic account given here, where stress on cellular processes can trigger an attractor transition to a genomically unstable state [20], which may ultimately lead to a new species [7]. Natural selection can be seen as having both the role to stabilize species (during the ‘equilibrium’ periods, or maintaining the home attractor) and of selecting better adapted organisms in the ‘punctuations’, or phase of genomic instability. Blyth, it seems, like Wallace, got rather less recognition than he deserved in the context of evolutionary theory.

One of the implications of thermodynamic openness that often is ignored, although not by Rosen [31], is for peptide folding to proteins. In the environment of an energy-dissipating system, i.e. a cell, peptides are not bound to fold to the lowest energy tertiary structure present in plain water. Neither are they, contrary to Anfinsen's dogma [32], constrained to folding to a structure dictated by the amino acid sequence alone. In an open system, protein folding is not a random, i.e. indeterminate, process but a non-determinate dissipative one [22], which in the cell is commonly overseen by chaperone and co-chaperone proteins [33]. In general, therefore, no predictions of tertiary protein structure are possible from the information contained in the DNA sequence alone. Furthermore, recent evidence shows that there is far from a one-to-one agreement between the transcriptome and the proteome [34].

The tenet presented here indicates two inter-related reasons why it has so far not proved possible to forge a clear relationship between genotype and phenotype, except in the case of a limited number of coding sequences. In these latter cases, it appears a sequence uniquely specifies a peptide that folds to a single, specific protein and that protein acts alone in the phenotype. An example is the rare disease Ehlers–Danlos syndrome, where a mutated peptide is unable to fold into a protein with the properties (information) necessary to form sound collagen tissue. Mendel's experiments with pea plants and much of twentieth century experimental genetics studying very marked and, thus, easily measured traits, probably fall in the same category and are exceptions to the general rule [35]. The majority of coding sequences in the human genome lead to more than one peptide per sequence through diverse splicing of exons and each of these peptides may, through multiple folding opportunities, lead to more than one protein, which may be activated in a number of ways, for example, by phosphorylation. These proteins interact with each other according to the rules of engagement (information acquired upon folding and post-translational processes) to contribute to the output of the attractor [20]. These energy dissipative processes are symmetry-breaking and potentially give rise to emergent and irreducible (to the originating DNA sequences and indeed proteins) phenotypic properties.

What is termed the ‘missing heritability’ is manifested as a failure to be able to account for the genetic variation of complex traits (including common diseases) in terms of abnormal alleles containing single nucleotide polymorphisms (SNPs) as detected through GWAS [36]. GWAS might be described as a short cut in attempting to relate the genotype to the phenotype without sequencing the whole genome. The failure might be, therefore, due to problems with GWAS as a technique or, it may be due to there being no relationship to be detected. Empirical evidence strongly suggests the latter. The raw data in a report of a study comprising 50 000 identical or monozygous (MZ) twin pairs [2] indicate that for four cancers known not to have a strong dependence on family history of the disease, the fraction of concordant pairs, fc, is less than 0.03, and for cancers with a strong family connection (breast and colon cancers), fc < 0.1. MZ twins share identical genotypes and in a high proportion of cases closely similar environments. Therefore, if genetic risk is a major component of overall risk, then values of fc closer to unity would be expected. On the other hand, if the genetic risk were a small component of overall risk, then the co-habitation of the twin pairs would likely be the main contributor to concordance. This is illustrated by chronic fatigue (fc < 0.26) which is likely viral in origin, or diseases where domestic environment is known to strongly influence risk, such as coronary heart disease (fc = 0.25) through life style and diet, and lung cancer (fc < 0.06) through secondary exposure to tobacco smoke. The evidence, therefore, does not support a strong component of genetic risk, especially for cancers, which account for approximately 30% morbidity in populations of industrialized countries.
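The text does not define how the fraction of concordant pairs, fc, is calculated. As a hedged sketch, the following assumes the common pairwise definition: among twin pairs in which at least one twin has the condition, fc is the fraction in which both twins have it. The function name pairwise_concordance and the counts used are hypothetical and are not data from [2].

```python
def pairwise_concordance(concordant_pairs, discordant_pairs):
    """Assumed definition: both-affected pairs divided by all affected pairs.

    concordant_pairs: pairs in which both twins have the condition
    discordant_pairs: pairs in which exactly one twin has the condition
    """
    affected_pairs = concordant_pairs + discordant_pairs
    if affected_pairs == 0:
        raise ValueError("no affected pairs, concordance is undefined")
    return concordant_pairs / affected_pairs

# Hypothetical counts chosen only to show the arithmetic.
fc = pairwise_concordance(concordant_pairs=3, discordant_pairs=97)
print(fc)  # 3 of 100 affected pairs are concordant
```

Under this definition, a condition determined largely by a shared genotype would give fc close to unity for MZ twins, which is the comparison the argument above relies on.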

From the above, it is clear that the origin of cellular complexity is not exclusively the genomic input, but everything, in particular the surroundings, because everything depends on everything else. This is in agreement with the view that ‘genes’ do not have a privileged position in terms of causality [37,38] and do not constitute ‘the book of life’ [39–43]. The role of surroundings is obvious in dissipative intracellular processes, such as polypeptide folding and protein interaction. The free energy input that powers the development of organisms results in the renowned symmetry-breaking [10]. Woese [44], in a criticism of the current reductionist-based cell and molecular biology, has already proposed that cellular complexity relies on protein interaction. In effect, since the emergence of multicellular organisms based on eukaryotic cells, the expansion of information (complexity) output from the phenotype has been due to the greater complexity of active protein interactions within the attractor, rather than primarily the result of adding more components (ultimately the products of diverse coding sequences) to the attractor. This expandable component of cellular processing can be regarded as responsible for the observed diversity of organisms (see below) and account for the fact that markedly morphologically and functionally diverse organisms can have nearly identical genomic sequences. For example, the mouse has nearly as many gene coding sequences as the human, many the same, with a considerable degree of synteny (sequence ordering in the chromosomes), yet is phenotypically quite distinct. The genomic sequence of chimpanzees is even closer to that of humans (99% concordance where sequences are common to both; see http://www.nature.com/scitable/knowledge/library/primate-speciation-a-case-study-of-african-96682434), yet again, there are marked phenotypic differences. This phenomenon is a consequence of two factors.
First, while both the DNA and the attractor are inherited at cell division and fusion, as noted above, the attractor is the more fundamental in terms of inherited phenotype because it is the origin of phenotypic causality [23]. Second, phenotypic output is contingent on the position of the attractor in the state space of active gene products [20]. Thus, it is proposed that mammalian morphological diversity derives primarily from this expandable reservoir of complexity providing information for cell-to-cell communication and consequent aggregation of cells into complex body and function plans, i.e. organisms. That is, form and function are determined by the position of the attractor in a (for mammals at least) nearly universal conceptual state space based primarily on the same potential reservoir of peptides, which are ultimately derived from a finite set of coding sequences which are to some degree modifiable for adaptive needs [45] by the phenotype itself.

The notion of a process of protein interaction represented by an attractor state is an important component of the model presented here. Briefly, Waddington [26] used the attractor concept metaphorically [46] in the context of an epigenetic landscape. The term ‘equifinality’ was used by von Bertalanffy [24] to refer to an attractor and for that in an open system. He distinguished it from homeostasis through feedback, such as in the case of a thermostat or the classic bi-stable switch [47]. Kauffman [48] also invoked attractor states in autocatalytic sets, and in Boolean matrices as a metaphor for cell fate. In the case of the latter, there is a clear distinction to be drawn between cell fate and phenotype, see [23]. The formalized concept in the context of the cell applied here [20] defines the attractor as a cellular function.
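For readers unfamiliar with attractors in Boolean networks of the kind invoked by Kauffman, the following is a minimal sketch. The three-node network and its update rules are invented for illustration and do not represent any real gene circuit or the formal model of [20]. The sketch enumerates all eight states, follows each until it repeats, and collects the distinct attractors (fixed points or cycles).

```python
from itertools import product

def update(state):
    """Synchronous update with hypothetical rules: a' = b, b' = a, c' = a AND b."""
    a, b, c = state
    return (b, a, a and b)

def find_attractor(state):
    """Iterate from `state` until a state repeats, then return the repeating cycle."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = update(state)
    cycle = seen[seen.index(state):]
    # Rotate the cycle so that it starts at its smallest state; this makes the
    # same cycle reached from different starting points compare as equal.
    start = cycle.index(min(cycle))
    return tuple(cycle[start:] + cycle[:start])

# Every initial state drains into exactly one attractor (its basin of attraction).
attractors = {find_attractor(s) for s in product((0, 1), repeat=3)}

for attractor in sorted(attractors):
    print(attractor)
```

For these particular rules the eight states drain into three attractors: two fixed points, (0, 0, 0) and (1, 1, 1), and one two-state cycle between (0, 1, 0) and (1, 0, 0). The number of attractors is far smaller than the number of states, which is the property that makes the concept attractive as a metaphor for stable cellular states.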

Rönkkö [49] has demonstrated how emergent lifelike properties can be simulated on the basis of the application of rules of interaction between information bearing particles (‘atoms’). This modelling procedure is not constrained to any particular morphology; thus, in the case of multicellular organisms, where the constituent cellular phenotypes would be the information bearing particles, there would be no constraint on the body plans that could emerge: natural selection, however, would favour those best able to transduce energy from their specific ecosystem/environment. In Rönkkö's artificial life model, functions would be simulable by deploying specifically differentiated cells within the body plan with similar interaction rules. Thus, the phenotype of the organism can be seen as deriving from information in the cellular phenotype in self-similarity with the way cellular phenotype is derived from interacting proteins.

At this point, it is necessary to consider the role of the ecosystems within which organisms are embedded. Darwin fully recognized in the Origin the enduring nature of ecosystems and the relative constancy of the proportional numbers of the various organisms populating them, but, at the same time, he recognized their vulnerability to disturbance. He takes the example of grazed heathland near Farnham in Surrey, UK, populated by a few isolated clumps of mature Scots pines. Fencing off a section of the land to keep cattle out led to a rapid outgrowth of sapling Scots pines within the fence. Examination of the surrounding heathland revealed the presence of Scots pines, their growth prevented by the grazing cattle. Here, again, self-similarity is encountered: the stability of the ecosystem is contingent on the relative constancy of the proportional numbers of the correct, in fact, what Edward Blyth identified as ‘best organised in terms of agility, strength, and delicacy of sense’, constituent organisms. However, the ecosystem is important in another respect in the proposal presented here, namely in providing the context within which the morphological and functional features of organisms evolve. The principle of least action dictates that organisms will adopt, within the prevailing constraints, the best way available to extract free energy from the ecosystem. That is interpreted here as saying that the information, at the level of cellular phenotype, evolves (adapts) to optimize the body plans and functions such that they are most efficient at extracting the energy available from that specific ecosystem. The ecosystem, therefore, has a role in the emergence of phenotype at the organism and cellular levels, as indeed, organisms play a role in the evolution of the ecosystem by, for example, creating niches for other species to exploit. 
In conventional biology, genetic variation is proposed to account for adaptation to environment, and speciation; in the model presented here, it is the variation in the deployment of proteins contributing to the attractor and the possibility of attractor transitions stimulated by stress from the environment that are responsible for macro-evolution, i.e. speciation [20]. It should be noted that attractor transitions caused by violations of the rules of engagement between gene products are not gradual, but rather are ‘jumps’ in which the participation, including the degree of activity, of several gene products can change in a single transition [20]. On the other hand, attractors become conditioned to the environment by a gradual process of adjusting the position of the attractor in the state space, evolutionary conditioning, to optimize the integrity of replication.

It may be assumed, therefore, that the ancestor common to roundworms and mammals provided an expandable potential, through the available peptides encoded in the genome, for exploitation in terms of morphology and function. C. elegans adapted to be able to function in most ecosystems (in soil), whereas the mammalian branch evolved in a more limited range of more sophisticated ecosystems, and was better able to exploit much more fully the opportunities that could be derived from the expandable range of protein interactions. This should have led to the potentially testable situation where a greater proportion of phenotypic traits have a one-to-one association with coding sequences in C. elegans compared with, say, Homo sapiens. The challenge here is to make a quantitative assessment of phenotypic output.

Treating ecosystems in terms of thermodynamics, Schneider & Kay [50, p. 167] argue that ‘life is a response to the thermodynamic imperative of dissipating [energy] gradients’. Biological development occurs when new pathways (actions) for degrading energy emerge. The authors propose that consequently the more developed an ecosystem the lower will be the re-irradiated black body temperature (free energy) and cite evidence for this in terms of measurements of surface temperatures of various ecosystems, which show a trend to lower values the more developed the ecosystem. Equally, it is known that the ‘density’ and diversity of life in ecosystems varies with latitude given adequate rainfall (cf. tropical and temperate forest; see http://www.nature.com/scitable/knowledge/library/terrestrial-biomes-13236757) as do the body masses for organisms with few physical constraints on the size to which they can grow, such as snakes. The largest known fossil snake, Titanoboa, was found in Colombia close to the equator [51].

Given the traction that genetics has had over modern biology for the past 60 years, it is easy to forget the extent of the debate, both prior to that period and since, over whether the origin of heredity lay in the nucleus or the cytoplasm [52]. The model advanced here prioritizes inheritance from the cytoplasm but, most importantly, it is the inheritance of ‘process’, not ‘material’. Griffiths & Gray [53] point out that inheritance cannot be solely nuclear, as the organelles in the egg as well as many other features are inherited, and that this is not controversial in the context of development, but is in the context of evolution. Demonstrating the inheritance of process directly is impossible, because newly formed cells need DNA as the template from which the peptides required for cell function are produced. However, that cells are regulated from the cytoplasm has been demonstrated and is exemplified in humans. Early experiments showed that enucleated fibroblasts could survive in vitro and appear normal in all respects other than not having a nucleus [54]. Enucleated fibroblasts with functional hypoxanthine phosphoribosyltransferase (HPRT) enzyme activity are able, through the formation of gap junctions and the transfer of nucleotides or their derivatives, to correct HPRT-deficient cells, whereas the karyoplasts (nucleus and remnants of cytoplasm) of the HPRT-competent cells are not [55]. These results demonstrate clearly that complex communication between cells can be initiated and take place in the absence of ‘genes’. Furthermore, erythrocytes (red blood cells) are enucleated as they are released into the blood stream, but are still capable, within their lifetime of roughly four months, of exhibiting complex phenotypic features such as circadian rhythm [56]. In fact, a largely temperature-resilient 24 h cyclic phosphorylation of one of the three proteins responsible for circadian rhythm in cyanobacteria can be reconstituted in vitro with extracted proteins incubated with ATP [57].
Finally, tardigrades, small water-dwelling animals that exhibit extraordinary resilience to environmental stresses, including ionizing radiation [58], seem to owe these properties to being eutelic, i.e. their somatic cells do not divide after hatching from the egg. Thus, once the organism is hatched, the DNA of its somatic cells is of little consequence to cell function. Heavily irradiated adults are able to lay eggs, but the eggs do not hatch. Developing eggs are radiosensitive in the early stages and only acquire resistance to radiation in the final stage of development [59], presumably when cell division is no longer required. Tardigrades can be regarded as revealing epigenetic regulation in a multicellular organism, which requires undamaged DNA primarily as a template for replication.

If the metabolism-first origin of life is assumed, and it is fundamental to the reappraisal described here, then peptide sequence must have been encoded on to DNA at the transition point between ‘nearly life’ and true life as it exists today, that is, approximately 3.5 billion years ago. As noted above, without true replication, gains in the efficiency of energy transduction from the ecosystem/environment would not be able to accumulate efficiently (evolution would be much slower, or might not occur at all), so there would be an obvious advantage in adopting true replication. How this was achieved at the molecular level is unclear, but it is not necessarily an insurmountable obstacle, as mechanisms for reverse translation (protein to RNA) have been proposed [60–62] and the protein reverse transcriptase (facilitating the transcription of RNA sequence into DNA sequence) exists. Reverse translation requires the amino acids of the peptide to couple with their tRNA base unit counterparts to form the template RNA that can be polymerized to mRNA. Of course, the chemistry that preceded true life was not necessarily based on peptides, and so other routes must remain a possibility. As noted by Cook [61], once truly replicating cells existed, they had no use for reverse translation, so the ability would likely have been lost in the 3.5 billion years of evolution that followed. One potential objection to reverse translation as a means of introducing DNA coding in the context of a metabolism-first origin is the question of how useful peptides came to be coded, whereas useless ones were not. It might then be speculated that in the ‘nearly life’ phase DNA/RNA coding for a vast diversity of peptides accumulated, constituting a DNA/RNA database, and that those sequences that proved useful led to bacteria at the true origin of life.
It is notable that the range of peptides deployed by bacteria is hugely greater than that deployed by mammalian cells: the bacteria inhabiting the human gastrointestinal tract deploy in excess of nine million different peptides [63]. It also seems likely that bacteria operate on a one-to-one basis between coding sequence and phenotype, but achieve complex traits by cooperation between diverse species/strains [64]. By contrast, eukaryotic cells have achieved much greater multicellular complexity with a very much smaller range of peptides exploiting intracellular cooperation. This is not to deny a role for the very different organization of the DNA and the presence of membranes, organelles, etc., in the eukaryotic cell, as, no doubt, the cytosol plays a crucial role in facilitating the protein–protein interactions. It is interesting to note that fossilized communities of cyanobacteria (stromatolites) and biofilms [65] date back to close to the origin of life, so it would seem that some form of multicellularity has long been the norm for the life process.

5. Conclusion

The thermodynamic tenet presented here represents a major departure from conventional thought on the basis for evolution and its products. It refocuses attention, for purely physical reasons concerning the role of energy in the natural process called life, on the role of metabolic processes and, therefore, proteins, rather than DNA, and on the cytoplasm, rather than the cell nucleus. First and foremost, the model is based on the physics of dissipative systems, fully embracing the implications at the molecular level of thermodynamic openness and the quest of the cell/organism/ecosystem to attain stationary status with its surroundings in least time. A supreme law of physics governs the life process, namely the law of least action, equivalent to the second law. Through the former harnessing natural selection and the latter being responsible for producing entropy in the form of bound energy (matter), it has resulted in the diversity of organisms, extinct and extant. The result of these evolutionary processes can be viewed on three levels, namely the cell, the organism and the ecosystem. Each level (cellular phenotype, organism phenotype and ‘ecotype’) is represented by an attractor state comprising, respectively, proteins, cells and organisms, and yields, through self-similar processes, emergent properties at the respective higher levels.

Growth and reproduction can be seen as processes that, in the natural ecosystem context, most efficiently dissipate the incident free energy from the Sun, each level of the hierarchy seeking the minimum free energy state in relation to its own environment, i.e. proteins within cells, cells within organisms, etc. In this context, the emergence of H. sapiens from the Stone Age onwards, some 6000 years, or roughly 0.00017 per cent of the approximately 3.5 billion years of life on the Earth, most probably represents a unique departure from that which prevailed before. Many species contribute their proportional numbers to more than one ecosystem, C. elegans, for example, being almost ubiquitous in soils, but has any species other than H. sapiens so grossly over-contributed its proportional numbers, to the extent of severe disturbance and obliteration of some ecosystems and the annihilation of so many other species? For example, a current threat to marine ecosystems is the increasing dominance of jellyfish over other organisms, most likely resulting from overfishing and eutrophication by washed-off nitrogen fertilizers [66]. Based on the arguments above, ecosystem endurance has been integral to the evolutionary processes of adaptation that have improved the efficiency of energy transduction and entropy production. Prior to the Stone Age, it would seem that climatic change was the primary threat to the endurance of ecosystems. Each ecosystem has its top predator, but clearly, in general, such predators have not abused that position, or H. sapiens would not have evolved.

Within this framework for biology the gene, as it is generally regarded, is a merely mechanistic, not a profound, concept. While gene coding sequences are inherited, they are not the ‘units of inheritance’ discovered by Mendel: those are the processes that contribute to the attractor which represents, at the cellular level, the phenotype. The emergent nature of phenotype precludes reducing it to the actions of individual proteins. Furthermore, there is no contiguous deterministic pathway between the information in the gene coding sequences, from which the proteins are derived, and the information inherent in the active proteins participating in the attractor. Thus, population genetics is founded on a subset of coding sequences that can be related to phenotype in a statistical sense, but not based on causation or a viable causal mechanism: genetics, as it is understood today, does not have any biological significance.

The evolution of complexity is a long-standing puzzle, predicated on the belief that maximum entropy, as dictated by the second law, would mean maximum disorder, as it is ascribed to thermodynamically closed systems (see footnote 5). The insight that the second law dictates maximum complexity in open systems capable of evolution goes part of the way to resolving this puzzle. The second part of the solution is the adoption by eukaryotic cell systems of a methodology for achieving more complex outcomes at the cellular phenotype level through complex post-translational ‘protein chemistry’, fuelled by a limited number of peptides and an even more limited number of coding sequences. While the mechanistic details of this ‘chemistry’ are far from unravelled, its existence cannot be doubted on the basis of empirical evidence, logic and the underpinning physics. Perhaps most controversially, it leads to the conclusion that cellular function is an emergent property, expressed through downward efficient causation from the phenotype to the genotype.

Thus, while gene sequencing may assist in understanding the origins of life forms, as antiquarian books help to reconstruct human cultural history, it is predicted here that the healthcare revolution anticipated by the UK's Human Genomics Strategy Group [67, p. 14] of ‘patient diagnosis and treatment based on information about a person's entire DNA sequence, or “genome”—becoming part of mainstream healthcare practice’ is over-optimistic.

Acknowledgements

We acknowledge the insightful comments and encouragement of our reviewers.

Footnotes

↵1 Symmetry, in this context, is invariance of the system when viewed from different perspectives. Liquid water, at the molecular level, looks the same from every geometrical perspective, but ice does not; the symmetry has been broken by the phase transition.

↵2 In this context, the term epigenetic means ‘over and above genetics’ and has no connection to the more recent meaning attached to the term, namely chromatin and DNA marking.

↵3 There are, of course, examples of loss, such as the loss of sight when organisms adapt to living in darkness, and certain bacteria have lost metabolic complexity when adapting to a specific nutrient-rich environment, as in the case of Mycoplasma pneumoniae in the lung, but these can also be seen as minimizing actions.

↵4 The term attractor is often used in a casual or metaphorical way, but for its use in this context it has been strictly formalized [20] and related to the thermodynamic imperative to consume free energy in least time. Moreover, we wish to depart from the common, but erroneous, view that an attractor is a predetermined state: during the process of free energy consumption, the attractor itself moves from its initial position in the free energy landscape. For example, when a stem cell begins to differentiate owing to signals from its surroundings, the surrounding cells will also adapt to changes in the differentiating cell. In other words, the non-determinism in the free energy consumption follows from the fact that everything depends on everything else. We work this valuable insight into the powerful notion of an attractor. While it is essentially a process of gene product interaction, it can be thought of as the profile of active gene products (proteins), at any point in time, which is regulating the cell and which, because it is inherited at cell division, is also engaged in the inheritance of cellular phenotype. It is postulated that the information on the contributing active proteins is in the form of ‘rules of engagement’ specifying interactions with other proteins, in much the same way as the information on enzymes specifies the substrate. As a result, the profile changes with time, leading to the evolution of phenotypic properties of the cell. Thus, cell regulation is postulated to be epigenetic rather than genetic [7], and what is inherited at cell division and cell fusion is a process in addition to material, i.e. DNA. Of course, that dividing cells undergoing differentiation maintain their pre-division state requires that the process be inherited.
A property of such a profile is that it is a free energy minimum in the gene product activity state space and, thus, is surrounded by a basin of attraction which endows quasi-stability and leads to stable states (attractors) existing only at discrete points in the state space, that is, phenotype is not a continuum [7] and phenotypic/attractor transitions are not gradual.

↵5 When a system is truly closed, it cannot even exchange energy to become disordered in a disordered environment or ordered in an ordered environment. If, however, the system is allowed to exchange energy, but not to gain or lose it overall, then the system will attain the same degree of coherence as its surroundings. So disorder is no end in itself, but is common because the superior surroundings, i.e. free space, have very little coherence.

1835 An attempt to classify the ‘varieties’ of animals, with observations on the marked seasonal and other changes which naturally take place in various British species, and which do not constitute varieties. Mag. Nat. Hist. 8, 40–53.