The Surprising Origins of Evolutionary Complexity

Scientists are exploring how organisms can evolve elaborate structures without Darwinian selection

Charles Darwin was not yet 30 when he got the basic idea for the theory of evolution. But it wasn't until he turned 50 that he presented his argument to the world. He spent those two decades methodically compiling evidence for his theory and coming up with responses to every skeptical counterargument he could think of. And the counterargument he anticipated most of all was that the gradual evolutionary process he envisioned could not produce certain complex structures.

Consider the human eye. It is made up of many parts—a retina, a lens, muscles, jelly, and so on—all of which must interact for sight to occur. Damage one part—detach the retina, for instance—and blindness can follow. In fact, the eye functions only if the parts are of the right size and shape to work with one another. If Darwin was right, then the complex eye had evolved from simple precursors. In On the Origin of Species, Darwin wrote that this idea “seems, I freely confess, absurd in the highest possible degree.”

But Darwin could nonetheless see a path to the evolution of complexity. In each generation, individuals varied in their traits. Some variations increased their survival and allowed them to have more offspring. Over generations those advantageous variations would become more common—would, in a word, be “selected.” As new variations emerged and spread, they could gradually tinker with anatomy, producing complex structures.

I’m probably missing the idea here. I thought it was well established that mutations can happen that don’t affect survivability and that was called Genetic Drift. So my naive paraphrase of this article would be “sometimes genetic drift can happen to increase complexity”, which doesn’t seem earth-shattering.

So far the most interesting part of the article for me was this quote from one of the critics: “According to the law, complexity may increase in the absence of selection. But that would be true only if organisms could actually exist beyond the influence of selection.”

It’s not just you; the idea of genetic drift kept coming up in my mind as I read the article too.

I also struggled to understand how laboratory bred flies were viewed as a neutral reference. A Lab is merely a different environment to evolution. ‘Free’ food in the Lab reduces the cost of variations that, in a wild environment, would be too costly to sustain. An obvious example is the more colorful wings quoted. In the Wild would such an allele not make those flies more visible to predators such as birds?

Clearly, mutation occurs continuously – if it did not there would only be one or two species. By reducing the costs of variation Labs have ensured that their flies have evolved differently. But does this, as the article suggested, equate to a new theory of the emergence of complexity?

Genetic drift, it seems to me, will also allow alleles existing in the original Wild population, transported to and bred in the Lab, to emerge. But no-one seems to be saying that this is part of what the scientists are discovering? Either way, what would be being discovered about complexity here?

One Researcher says that scientists are not clear on their definitions of complexity. But after reading this I can see no reason to move from the dictionary definition.

Basically, Red Dog, it seems to me that one of two things has happened here:

some scientists have spent a good chunk of their careers studying things that prove existing evolutionary theory – but not understanding the results

Scientific American is guilty of some really poor science reporting.

I remain open minded, and will return here regularly to see if someone has a better explanation.

That was a great comment, it was at least as interesting and clear as the article. The one thing I would quibble about is that I think defining complexity is an interesting problem. It’s also an issue in computer science; it’s a crucial metric to define in order to accurately estimate how much time and money a system will take to develop. In computer science one of the ways you define complexity is by the number of interfaces between software components (the inputs and outputs). I wonder if it would be possible to define something analogous in biology.

That was a great comment, it was at least as interesting and clear as the article.

Thank you. Given the value of the OP, I wonder if that counts as a back-handed compliment?

The one thing I would quibble about is that I think defining complexity is an interesting problem. It’s also an issue with computer science …

I have just spent an hour going round in circles on Wikipedia; complexity, emergence, systems theory, and so on. Forgive me, but I didn’t find it interesting. Are we discussing: Emergence of complexity?

All the articles I could find on complexity seemed to me to be pretending to have an answer to an unspoken question. What was missing was a coherent definition of what makes complexity a unique or useful measure.

On that basis, the Researcher in the OP who complained about people not using a clear definition of complexity seems to be right. But that still leaves me asking: What is complexity in biological systems?

The articles I could find on systems use a twist on whole. Like the OP they make assumptions about discernible bodies, separable from their current environment, which are unwarranted. It’s like saying that Stephen is an identifiable system – a separate whole – and we can study Stephen and make conclusions about Stephen’s ‘complexity’ (whatever that means) based on our ability to identify Stephen and to isolate Stephen.

Note that this comes close to being the antithesis of evolutionary theory. Evolution works because it says that whole systems, complex or not, are inseparable from their environment. Modern studies of viral DNA crossing over into hosts, the fact that many bodies are composed of bacteria that are vital to bodily function, yet separable systems in their own right, bear out the importance of environment at the cellular level. Add to this the rather more obvious, but frequently overlooked, things like chemical environment (Stephen’s whole quickly collapses without oxygen or water, for example).

But I digress.

it’s a crucial metric to define in order to do accurate estimates of how much time and money a system will take to develop.

In computer science we begin with a target. That target will come from the business environment (business being a very loose definition). Creationists begin the same way, they start by assuming that human life is a pinnacle to be climbed, an end-game to be achieved, a goal to be won. On this basis we see complexity, and we can’t see the wood for the trees.

Evolution starts from the other end and says we begin with nothing and build, one step at a time. Simple and elegant – like all the best science.

In computer science one of the ways you define complexity is by the number of interfaces between software components, the inputs and outputs …

I wouldn’t know, I’m not a Computer Scientist. I am a former ICT Project Manager, and a former Business Programme Manager. I can honestly say that I did everything in my power to avoid defining complexity. My key guiding principle was KISS (Keep It Simple, Stupid).

I wonder if it would be possible to define something analogous in biology.

Why would you care when KISS, so comprehensively embraced (however unwittingly) by Darwin, needs no such definition?

Actually, never mind biology. What on earth are computer scientists doing by studying ‘complexity’? Computer systems evolve from simple beginnings too – as the business environment that bore them evolves.

Look, I’m not an expert in biology and what I know about complexity I learned from Wikipedia in an hour. Maybe I’m missing something?

Basically, this seems to be saying, that if selection pressure is removed in the lab, all sorts of recessive variations will be preserved in the population.

That is a central part of my gripe with the OP. Laboratories do not simply remove selection pressures – they substitute new selection pressures. In the case of the Lab a selection pressure might be to favour larger numbers of offspring (a better return for the Lab’s investment), or favouring flies with a greater ability to survive in a monoculture (the Lab needs simple food requirements). Given time, it would be possible to come up with a whole host of additional potential pressures. To be fair, the OP does cover this, saying that to be comprehensive the fly study should have studied the flies that died before breeding, or failed to develop to adulthood – which they didn’t.

The OP researchers say that they studied flies used in laboratories at the beginning and end of a period of time they selected as appropriate. But I fail to see how they were able to capture – and adjust their results for – the selection pressures that were applied (consciously and unconsciously) by different staff at different laboratories?

Hence the similarity of wolves, coyotes, wild dogs etc. in habitats with low survival ratios, but a huge diversity in over-protected domestic dogs.

Surely Darwin himself used domestication to illustrate how human beings substitute their own selection pressures for the wild environment selection pressures – in order to illustrate his central thesis of natural selection. As reported: This does seem to be being ignored by the researchers in the OP.

The various traits we see in domesticated wolves (dogs) were selected for by breeders. Whether the traits (alleles) we see in the many breeds of dog around today were the result of selection on genetic variations that existed in the original wolf – but were invisible – or the result of selection on later mutations will be very difficult to demonstrate.

This is Red Dog’s concern; the flies at the beginning of the OP research may have had genetic variations that were not visible (because the researchers, reportedly, were using identifiable traits – called alleles – to measure variation between generations and not genetic markers). Thus, the theory of genetic drift appears to have been ignored by the researchers (or, possibly, by the journalist – or both).

The researchers appear to have made two glaring errors – as above.

All this is before we get to what the researchers are trying to achieve. They say that they’re trying to qualify how ‘complexity’ can arise in biological systems – from the OP:

Some argue that life has a built-in tendency to become more complex over time. Others maintain that as random mutations arise, complexity emerges as a side effect, even without natural selection to help it along.

In other words, set aside Darwin’s great insight – natural selection. I confess that I missed this the first time around – but your comment, Alan, made me go back and re-read the OP. Thank you for that.

Although the journalist quotes Douglas Erwin’s critique of the ‘complexity’ being studied – and notes that selection is still required for the ‘complexity’ that the researchers observed to survive into future generations – that still leaves the question of why they thought any new evolutionary insight was being discovered. I’m not a science journalist, but even I can see that this is a dead end.

At one point I got:

… another example of constructive neutral evolution …

Are these people being deliberately confusing, or can they just not make up their minds? Is it constructive or is it neutral?

My understanding of existing evolutionary theory is that it already tells us that mutations occur and that they can have three possible results:

The mutation has no appreciable cost, in the current environment, and may – or may not – therefore be passed down to subsequent generations

The mutation has an appreciable cost, in the current environment, and is less likely to be passed down to subsequent generations

The mutation has an appreciable benefit, in the current environment, and is more likely to be passed down to subsequent generations

A change in the environment may change the mutation’s value.
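The three outcomes listed above can be sketched as a toy simulation. This is a minimal Wright-Fisher-style sketch, not the method used by the researchers in the article; the function names, population size, starting frequency and fitness values are all assumptions chosen for illustration. A fitness advantage of zero models a mutation with no appreciable cost or benefit, so any change in its frequency is pure drift.

```python
import random


def simulate_allele(pop_size, start_freq, fitness_advantage, generations, seed=0):
    """Return the allele frequency after each generation (illustrative model only)."""
    rng = random.Random(seed)
    freq = start_freq
    history = [freq]
    for _ in range(generations):
        # Selection: shift the expected frequency. An advantage of 0 is neutral,
        # a negative value is a cost, a positive value is a benefit.
        weighted = freq * (1.0 + fitness_advantage)
        expected = weighted / (weighted + (1.0 - freq))
        # Drift: each offspring independently inherits the allele by chance.
        carriers = sum(1 for _ in range(pop_size) if rng.random() < expected)
        freq = carriers / pop_size
        history.append(freq)
    return history


def fixation_rate(fitness_advantage, trials=100):
    """Fraction of independent runs in which the allele spreads to everyone."""
    fixed = 0
    for seed in range(trials):
        final = simulate_allele(50, 0.1, fitness_advantage, 150, seed)[-1]
        if final == 1.0:
            fixed += 1
    return fixed / trials


if __name__ == "__main__":
    for label, advantage in (("cost", -0.2), ("neutral", 0.0), ("benefit", 0.2)):
        print(label, fixation_rate(advantage))
```

In this sketch a neutral allele still sometimes takes over the whole population by chance alone, just less often than a beneficial one. Changing `fitness_advantage` between runs stands in for a change in the environment altering the mutation's value.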

The OP says nothing about what ‘complexity’ – however it is defined – adds to that theory. The OP wraps with a comment from a biochemist who thinks the idea presented, especially by challenging the notion that all complexity must be adaptive, is a good one.

If that Biochemist had changed one word – exchanging change for complexity – they would simply be describing the above, existing, evolutionary theory!

It’s true to say that putting the word complexity into evolutionary theory adds something – confusion. I think we can do just as well without it.

Complexity is a massive issue / problem in software development. Any software system will quickly become too complex for a human mind to grasp which makes debugging and maintaining it difficult.

In order to keep complexity down experienced software engineers like myself break things down into discrete modules that do a particular job and try and isolate these as much as possible from each other so that they can be worked on separately.

You then set up interfaces between these modules so that they can “talk” to each other to get things done, the key thing being that each does its job without being aware of the others and simply responds to messages (inputs).
I’m no biologist either but the little I do know suggests that biology uses chemical signals all the time to send messages between cells that otherwise have no links.
This is a bit off topic but you did pose the question.

Complexity is a massive issue / problem in software development. Any software system will quickly become too complex for a human mind to grasp which makes debugging and maintaining it difficult.

What both you and Red Dog appear to be saying is that rising complexity can be defined as a rising number of components in a system. Projects are similar. The more people that are involved, the more complex it becomes to reach the project goal – on time and on budget. Also, in both cases, the relationship is non-linear. As the old adage has it: Twins are not just twice the trouble, they’re four times the trouble. This is, of course, also the basis of Chaos Theory.
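The non-linear relationship mentioned above can be made concrete by counting possible pairwise links between components. This is only a sketch of one simple way to count; the function name is made up for illustration, and treating every pair as a potential link is an assumption.

```python
def pairwise_links(n):
    """Number of distinct pairs among n components: n * (n - 1) / 2."""
    return n * (n - 1) // 2


for n in (2, 4, 8, 16):
    # Prints 1, 6, 28 and 120 links respectively.
    print(n, pairwise_links(n))
```

Under this count, doubling the number of components roughly quadruples the number of possible links once the numbers get large.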

In order to keep complexity down experienced software engineers like myself break things down into discrete modules that do a particular job and try and isolate these as much as possible from each other so that they can be worked on separately.

Yes, like Chaos Theory, the increase in complexity makes the system increasingly sensitive to starting conditions – I get that.

I’m no biologist either but the little I do know suggests that biology uses chemical signals all the time to send messages between cells that otherwise have no links. This is a bit off topic but you did pose the question.

I’m responding because I don’t think it is off-topic.

Reading your post gave me a possible insight into how the OP researchers are thinking. They’re like graduates who’ve arrived at the biological equivalent of Microsoft and they’re bowled over by the complexity of what they see before them.

In both cases people are starting with the complexity and attempting to work backwards in order to try and work out how these systems work and how they came to work the way they do.

Those of us whose first PC came before the IBM PC know that it is far simpler to begin at the beginning and work forwards.

This is what Darwin’s theory does. It starts with biology’s equivalent of Tommy Flowers – strapping together some components that are around – then letting society at large (an analogy for the biological environment) mould the future.

As a Programme Manager I would attempt to do something different to the OP researchers, and to computer programmers. Confronted with the Chief Programmer’s report that changes will cost X,XXX,XXX, take eons and involve every one of 100,000 staff, how do I find a solution that only costs XXX, will take six months and will only involve disrupting the work of a dozen staff? I’ve done it too.

I’m not pretending that people are coming up with Darwinian-class insights every day. I’m just saying that, if there’s a proven way to combat complexity it’s going back to basics.

Then again, biology does have the original Darwin insight.

KISS.

I’m sorry I only liked your comment as a Guest – I forgot to sign in first.

In reply to #3 by Stephen of Wimbledon
That was a great comment, it was at least as interesting and clear as the article.

Thank you. Given the value of the OP, I wonder if that counts as a back-handed compliment?

Another interesting comment. I want to make a longer reply, but I wanted to say something first about complexity and computer science. It’s one of those things that academics spend a fair amount of time on (sounds like you waded through some of it), but you pretty much hit the nail on the head: KISS, and that even applies to the metrics themselves.

As I said, one measure of complexity (and probably the best from a theoretical standpoint) is the number of interfaces, but in the real world, at least in my experience, lines of code worked almost as well and was what engineers used when they actually bothered to do metrics (which unfortunately is not nearly often enough).

I had a boss who helped me appreciate the science in computer science. He was really into testing and measurement and he used to joke about the researchers who had all these complicated metrics. “Just print out the source code and plop the pages on a scale” was his approach. BTW, this wasn’t because he didn’t take it seriously. He took it really seriously; he just realized that spending 100% more effort for a measure that was 1% more accurate was counterproductive.
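For what it is worth, the "pages on a scale" metric is about as simple to automate as it sounds. The following is a rough sketch, assuming Python-style `#` comments; the function name and the decision to skip blank and comment-only lines are illustrative choices, not a standard.

```python
def count_code_lines(source):
    """Count lines that are neither blank nor whole-line '#' comments."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


sample = "import os\n\n# set up\nx = 1\ny = 2\n"
print(count_code_lines(sample))  # 3 lines of actual code
```

A count like this says nothing about how the pieces connect, which is exactly the trade-off described above: much less effort for a slightly cruder measure.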

I’m really digressing, this was supposed to be the short reply, I’ll leave it at that for now, more later.

In reply to #6 by Stephen of Wimbledon:
You then set up interfaces between these modules so that they can “talk” to each other to get things done, the key thing being that each does its job without being aware of the others and simply responds to messages (inputs).

mr_DNA beat me to it. That is what I was thinking in one of my earlier comments and that is why I thought this article might be interesting. As I think about what a biological system is, it seems to me there may be ways of describing it that are analogous to a computer system. So you have hierarchical structures: the heart, the brain, etc. These are each composed of sub-components which themselves have sub-components, all the way down to the cellular level. Such hierarchies exist in computer systems as well. The hospital system consists of admissions and records, billing, lab work, bioinformatics, etc. Each of these systems can be broken down into sub-systems, down to objects and source code.

As Mr_DNA said one way we structure systems to make them maintainable is to control the interfaces between systems, so for example that when a bioinformatics system updates outcomes data using lab results it appropriately hides the ID of the patient. It seems to me that there are interfaces between biological systems as well, sharing of information via cells, nerves, blood, etc.

So what I thought was an interesting speculation (and I acknowledge that’s all it is) is that perhaps there can be some general measure of complexity that could apply to software and biological systems.

I’m not well qualified to talk about biology or computer science but I remember watching a video of Danny Hillis talking about how genetic algorithms that work can be almost impossible to understand because they just work but god knows why they work, they look so unlike what a programmer would write. The researchers from Cornell seem to be saying that placing a cost on the evolving programmes causes modules to arise. My perhaps stupid question is what is the complexity in this case? It seems to me that at a lower level the complexity is decreased because things are simpler with fewer connections between smaller components but that at a higher level things are more complex because we now have these modules, i.e. different types of parts. Hope I’m making sense here and not just spouting verbal diarrhoea!

I’m not well qualified to talk about biology or computer science but I remember watching a video of Danny Hillis talking about how genetic algorithms that work can be almost impossible to understand because they just work but god knows why they work,

I think you might be misinterpreting what Hillis meant. I don’t know about the specific genetic algorithm you are talking about but I think I know what Hillis meant, and it’s not quite “God knows why it works”. Hillis is one of the pioneers in highly distributed computing and computing using neural networks. Neural networks are a completely different programming paradigm from conventional software. With conventional software you have programming logic that takes you through various rules, conditions, etc. and gives the answer. At any point, if you interrupt the program, it’s possible to see where it is and to more or less understand what part of the logic flow (think of a flow chart) it’s currently in.

Neural networks are totally different. Neural networks work the way the human brain probably works. There are a bunch of software nodes and connections (if you know graph theory, a network graph). The connections are weighted by mathematical formulas, and the network can grow (adding new nodes and new connections) and evolve (changing the strength of the connections). This is (and was meant to be) a direct analog of the human brain’s neurons.

The amazing thing is that it actually isn’t just an interesting model to simulate the brain it turned out to be a programming technique that works on its own really well for certain kinds of problems. The problems are mostly signal recognition. So for example processing the digital information in a video feed, picking out the faces, and matching those faces against a database of terrorists would be something you use a neural network for. Other examples include processing sonar data and recognizing Russian subs from whales.

Now the “God knows how it works” I think refers to the fact that neural nets aren’t programmed the way traditional software is. You don’t just compile it, make sure it works on the test data and then go. You have to “train” it. So if you want it to recognize faces you start out with a basic network and you give it example data and you tell it “this is a face” or “this isn’t”. After a while it makes its own predictions, but you can continue to give it feedback and it can continue to refine the way it works.

What is much harder, though, is to walk back through the program logic for any particular decision and see the various branches taken. Also, if you find an error in a procedural program, say it’s recognizing dog faces as people faces, you can go in and directly change the logic. But on a neural net it’s a lot harder to figure out which nodes and connections are responsible for the error and to tweak them. The standard solution is just to give it more dog faces and keep telling it they aren’t human faces.
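The "train it with examples and feedback" idea can be illustrated with the simplest possible case, a single artificial neuron (a perceptron). This is a toy sketch, not what Hillis or any face-recognition system actually uses; the function names and the tiny two-feature data set are invented for illustration.

```python
def train_perceptron(examples, epochs=20, learning_rate=1.0):
    """Learn weights from labelled examples instead of hand-written rules."""
    n_inputs = len(examples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in examples:
            activation = bias + sum(w * x for w, x in zip(weights, inputs))
            prediction = 1 if activation > 0 else 0
            error = label - prediction  # the "this is / this isn't" feedback
            # Nudge each connection strength toward the correct answer.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, inputs):
    """Apply the learned weights to a new input."""
    return 1 if bias + sum(w * x for w, x in zip(weights, inputs)) > 0 else 0


# Hypothetical toy data: the label is 1 only when both features are present.
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(examples)
print([predict(weights, bias, inputs) for inputs, _ in examples])
```

Nowhere in this code is there a rule saying "both features must be present"; the behaviour lives entirely in the learned numbers. With three numbers that is still easy to read, but the same property is what makes a large network hard to debug by inspection.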

BTW, this is an active area of research and not one I’m all that current on so there may be advancements in how to debug and refine neural nets that I’m not aware of but I’m pretty confident that my description above is what Hillis had in mind.

The researchers from Cornell seem to be saying that placing a cost on the evolving programmes causes modules to arise. My perhaps stupid question is what is the complexity in this case? It seems to me that at a lower level the complexity is decreased because things are simpler with fewer connections between smaller components but that at a higher level things are more complex because we now have these modules, i.e. different types of parts. Hope I’m making sense here and not just spouting verbal diarrhoea!

I haven’t looked at the video yet but I think I can shed light on the general question. There have been various evolutions in software programming to make it more structured. (Now I’m back to talking about traditional code although a lot of this would apply to a neural net as well). Here is a simple example to give you an idea. Suppose you have a program to compute a rocket trajectory, and you call the final trajectory RESULT. Now suppose as part of the computation you need a function that calculates square roots. You bring in the code from some library and in that code there is also a variable called RESULT. In the really bad old days of programming it was possible that by importing the square root code into your bigger system you could mess up the whole system because they both use the same name but with a different meaning. This sounds trivial but when you deal with large systems it’s a real problem. There are also problems with logic flow and other complexities but I’m trying to keep this as simple as possible.

So what new programming languages and disciplines evolved to do was to make it so that even the dumbest programmers can’t make these kinds of mistakes if they use the language and tools correctly. If you have heard of object-oriented programming, that is the most important goal of OO. It’s called encapsulation: your object does its thing, my object does my thing, and if they need to communicate they do it through interfaces.

In the bad old days it was possible for someone to modify the square root function and, even if the square root function still provided the correct answer, for the change to cause unintended side effects in other parts of the program. Object-oriented programming tries to make sure that can never happen by encapsulation. You divide responsibilities up and you create various kinds of program logic. There is one set of logic that the object uses to do what it needs to do. There is another (much smaller) set of logic that forms the interfaces between the object and the rest of the software world. Any other object that needs information from or wants to give information to that object works ONLY through the interfaces. That way it’s impossible (well, much, much less likely) for a change in one object to affect another object.
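The RESULT clash described above can be reproduced in a few lines. This is a sketch only: the function and class names are invented, the "trajectory" arithmetic is a placeholder rather than real physics, and Python is used purely for illustration (its leading-underscore privacy is a convention, not something the language enforces).

```python
# Bad old days: two pieces of code share one global name.
RESULT = 0.0


def sqrt_with_global(x):
    """Newton's method that keeps its working value in the shared global."""
    global RESULT
    RESULT = x if x > 1 else 1.0
    for _ in range(50):
        RESULT = 0.5 * (RESULT + x / RESULT)
    return RESULT


def trajectory_with_global(height):
    """Placeholder calculation whose RESULT gets silently overwritten."""
    global RESULT
    RESULT = height * 2.0      # the value the caller cares about
    sqrt_with_global(height)   # side effect clobbers RESULT
    return RESULT


class SquareRoot:
    """Encapsulated version: working state stays inside the object."""

    def __init__(self):
        self._result = 0.0  # internal detail, not part of the interface

    def compute(self, x):
        """The only interface other code is meant to use."""
        self._result = x if x > 1 else 1.0
        for _ in range(50):
            self._result = 0.5 * (self._result + x / self._result)
        return self._result


def trajectory_encapsulated(height):
    """Same placeholder calculation; the square root cannot touch 'result'."""
    result = height * 2.0
    SquareRoot().compute(height)
    return result


print(trajectory_with_global(16.0))   # close to 4.0: the wrong answer
print(trajectory_encapsulated(16.0))  # 32.0: the intended answer
```

Both square root routines return the right square root. Only the first one damages its caller, which is the kind of side effect encapsulation is meant to rule out.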

By now I’ve probably bored everyone to death but it’s nice to have a question that has an actual answer that I’m not in doubt of for a change 🙂

He’s saying that the resulting code is unfathomable to the programmer in him but it works. The Cornell researchers to me are saying that given a cost imposed on building the programme that this type of unfathomable code would naturally be replaced by modules of code similar to what ye were saying above was a better way for people to code. Again that’s my best interpretation of both Hillis and Cornell but as I said before I’m chancing my arm even raising my voice on these topics!!! Doesn’t that mean that complexity doesn’t just arise from mutation but from a tendency for modularisation rooted in limited resources? In Cornell’s experiment the modularisation never arises from natural selection alone but does from the imposition of limited resources. If modularisation is what people mean by complexity then the complexity has limited resources as its cause and its fitness for purpose has natural selection as its cause. Put another way, things like eyes might not arise if organisms had a limitless bank balance. If I’m misunderstanding all of this don’t hit me too hard guys – I’m an amateur!!!

I’m not well qualified to talk about biology or computer science but I remember watching a video of Danny Hillis talking about how genetic algorithms that work can be almost impossible to understand because they just work but god knows why they work,

Obviously it’s silly to say that pampering fruit flies has removed them from natural selection – they still experience some level of gravity, some kind of air pressure, some diet, some humidity etc. so that you have only changed the selection pressures.

But in a computer simulation if you evolve an organism you can remove all but one element of natural selection. You can have a single test of fitness e.g. ability to sort numbers. If you run the same experiment twice and place limited resources on one experiment and you don’t place it on the other and if the limited resources experiment is always the one to result in modules forming then you can reasonably say that modules are caused not by natural selection but by limited resources. And if modules equate to complexity then complexity is caused by limited resources.

He’s saying that the resulting code is unfathomable to the programmer in him but it works.

Thanks for the link, interesting little snippet. So now I see what you mean and I agree with your interpretation. Genetic algorithms are very much like neural nets in that they aren’t programmed conventionally but work by refinement and example. The resulting programs will be complex to understand for a human programmer because the code is not designed by a human but rather by what I guess you would call un-natural selection, the programmer just produces lots of random iterations of code (most of which don’t work at all) and keeps the few that work.
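The "produce lots of random variations and keep the few that work" loop described above can be sketched in a few lines. This is a bare-bones illustration, not Hillis's sorting experiment or the Cornell setup; it evolves a bit string rather than a program, and the function name, target and parameters are all assumptions.

```python
import random


def evolve(target, population_size=50, mutation_rate=0.05,
           max_generations=500, seed=1):
    """Evolve random bit strings toward a target by mutation and selection."""
    rng = random.Random(seed)
    length = len(target)

    def fitness(candidate):
        # The single test of fitness: how many positions match the target.
        return sum(1 for a, b in zip(candidate, target) if a == b)

    def mutate(candidate):
        # Flip each bit with a small probability.
        return [bit ^ 1 if rng.random() < mutation_rate else bit
                for bit in candidate]

    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(population_size)]
    for generation in range(max_generations):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == length:
            return population[0], generation
        # Keep the best fifth unchanged, refill with mutated copies of them.
        survivors = population[: population_size // 5]
        children = [mutate(rng.choice(survivors))
                    for _ in range(population_size - len(survivors))]
        population = survivors + children
    population.sort(key=fitness, reverse=True)
    return population[0], max_generations


target = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]
best, generations = evolve(target)
print(best == target, generations)
```

No line in this code says how to build the answer. The programmer only writes the test and the keep-or-discard rule, which is why the evolved result need not resemble anything a person would have designed.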

However, I think this example is kind of a distraction from what we are talking about because unlike neural nets, object oriented programming, and software components genetic algorithms never really have proven to work on real world problems that I’m aware of. They were an interesting research initiative but kind of a dead end and most of the people who worked in that area have now gone over to neural nets. I know he talked about how they use them at Thinking Machines but I would like to know what specifically he has in mind because I’ve never known of a practical application for the idea that really worked aside from a prototype.

BTW, the whole design of the Connection Machine, the flagship product that Hillis’s company created was optimized for neural net systems. Originally such systems were just simulated on conventional computers, the highly distributed highly connected architecture of the Connection Machine was designed to run systems like that much faster.

I don’t think the Cornell researchers were talking about genetic algorithms at all but I only glanced at the article, will look at it in more depth when I get a chance.

Hi mr_DNA,
What both you and Red Dog appear to be saying is that rising complexity can be defined as a rising number of components in a system.

Just to be clear, not just the components but the connections between them. And my long digression was about a technique to organize systems into hierarchies and in so doing to reduce the possible connections between components. In a nutshell that’s the main idea of encapsulation: by constraining how components can communicate we reduce the possible connections (interfaces), which makes the resulting system more testable and manageable.

A quick question for clarity… Imagine 100 points on a map. Imagine each point is connected to the other 99 points by a line (say representing a road). Now imagine a second map with the same 100 points marked, but 10 groups of 10 points are internally connected, with 10 major roads then connecting the groups (say representing motorways). Which road system is considered the most complex? I would have said the second system of hubs. Am I going astray?

A quick question for clarity… Imagine 100 points on a map. Imagine each point is connected to the other 99 points by a line (say representing a road). Now imagine a second map with the same 100 points marked, but 10 groups of 10 points are internally connected, with 10 major roads then connecting the groups (say representing motorways). Which road system is considered the most complex? I would have said the second system of hubs. Am I going astray?

No, you are thinking of it the right way. Let me try to refine your question to show why I think the second graph is simpler, at least in a way that matters in computer science. Imagine that each of the links is weighted with a number representing the cost to traverse that link.

Imagine you want to find the shortest path between two specific nodes. In the first graph, where everything is connected to everything, you have a lot more possible paths; my graph theory is atrophied but I think the number is 100! With the graph that has ten clusters, where each node in the cluster is connected but each cluster is only connected to the other clusters, the number is much smaller. I think it would be 3 times 10! (once for the shortest path in each cluster to the main node and once for the shortest path connecting the clusters). Some CS student is probably going to tell me my numbers are all wrong, but the important thing is that it’s the possible connections that have to be considered that is the issue, and that is why the modular approach is considered better.
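Setting the path counts aside, the two maps can at least be compared by counting roads directly. This is a sketch under one reading of the question: each group of 10 is fully connected internally, and the 10 major roads are assumed to form a ring linking one hub per group. The function names and the ring layout are assumptions, not something stated in the thread.

```python
from itertools import combinations


def complete_graph_edges(node_count):
    """Every point connected directly to every other point."""
    return set(combinations(range(node_count), 2))


def clustered_edges(cluster_count, cluster_size):
    """Fully connected clusters, plus a ring of major roads between hubs."""
    edges = set()
    hubs = []
    for c in range(cluster_count):
        members = range(c * cluster_size, (c + 1) * cluster_size)
        edges |= set(combinations(members, 2))
        hubs.append(members[0])  # assumption: first point in a group is its hub
    for i, hub in enumerate(hubs):
        neighbour = hubs[(i + 1) % len(hubs)]
        edges.add((min(hub, neighbour), max(hub, neighbour)))
    return edges


print(len(complete_graph_edges(100)))  # 4950 roads
print(len(clustered_edges(10, 10)))    # 460 roads
```

On this count the clustered map has roughly a tenth of the roads, so there are far fewer connections to consider when searching for a route or testing how a change spreads.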

Am I missing something? I always knew mutation was the cause of variation, and essentially the source of all the complexity. Natural selection ( and for that matter, any selection) is the reason why some variations make it and some don’t. It’s the very reason there isn’t a whole bunch of cyclopes and 1 1/2 legged tortoises. It’s in fact the antithesis of variation.