The Scientific Method, Exemplified By Palaeontology

This talk is aimed at the basic, general public level, and is about how science is done through the scientific method, and how to differentiate ideas that belong in the yellow circle downwards from the ideas that are in the nebulous circles outside. To save it from being a very dry, boring, philosophical talk, all introduced concepts will be exemplified using my own science, that of palaeontology.

The talk is very simple with only four parts. First is a definition of what science’s aims are. Then we will look at what makes an idea a proper scientific hypothesis, then we’ll see how hypotheses are tested to confirm or reject them. Finally, all we’ve seen will be summarised in order to demonstrate why science has been so successful since its inception.

I took this definition of science’s goals from Carl Hempel’s excellent 1965 book, Aspects of Scientific Explanation. Science has two goals: to describe and to understand. In other words, the scientist gathers observations (description) and tries to find hypotheses (theoretical understanding) to explain those observations.

Basically, all of science can then be boiled down to why questions. “Why do I observe this, and not that? Because of this hypothesis.” For example: Why do we observe humans alive today, and not Neandertals? Because humans managed to survive the climate changes; or humans killed all the Neandertals; or whatever other hypothesis comes to mind.

As an example of the interplay between observation and hypothesis, consider cetaceans. If you dissect any cetacean of any species, any sex, and any life cycle stage, you will notice that they all have these tiny bones in their abdomen that don’t have any conceivable function. So as a scientist, you have to ask yourself why they’re there.

So you start hypothesising. Maybe the first hypothesis that comes to your mind is that these bones are just pieces that, for some reason, break off from the spine or the rib. But additional observations don’t match up with that hypothesis: the bones don’t look like spine or rib bones, and they don’t develop that way anyway.

Then you notice the position of the bones and think to yourself that they’re in precisely the spot that hindlimbs would be. So you hypothesise that they’re pubic and femoral bones. The problem with that hypothesis is that they don’t have the proper attachments: they’re not attached to each other or to the rest of the skeleton, as pubic and femoral bones should be.

But then you notice the size of the bones, and another hypothesis pops into your mind. However, biology using modern organisms will only take you this far. To test your new hypothesis, you need to look at palaeontology.

This hypothesis is that they’re vestigial pubic and femoral bones, i.e. that they got reduced through evolution because they no longer had any use (think of cave fish and how they become blind over evolutionary time). In order to test this hypothesis though, you need to look at the evolutionary history of your taxon, and the only line of evidence that directly preserves the evolutionary history of any taxon is, of course, the fossil record.

The slide summarises the major points of the fossil record; look at this post for some more information. At the beginning, 52 Ma, cetaceans were completely terrestrial dog-like animals. Gradually, they became more and more amphibious and eventually became fully aquatic, as demonstrated by the last common ancestor of the whales, Basilosaurus.

If you look at the hindlimbs, you’ll also notice their evolution: as soon as the animal became aquatic and no longer used the hindlimbs for locomotion, they immediately became small (no doubt because having useless legs flopping around would also be an impediment to the hydrodynamics of the animal).

In summary, your hypothesis is now supported by your observations of the fossil record.

Hypotheses can’t just be any idea that plops into your head. They have to fulfil certain criteria. The four most essential ones will be introduced here.

The first criterion is rationality. Your hypothesis has to make sense. I couldn’t think of an example from real palaeontology that was irrational (creationist poppycock doesn’t count), so I made one up: the hypothesis that trilobites are made of cheese.

This hypothesis makes no sense at all. Cheese is mostly-edible rotten milk. Trilobites are now-extinct arthropods that, when alive, had a calcitic exoskeleton, and are now known only as rocks. Rocks that are not made of cheese. The hypothesis simply does not compute. And even if you had to test it, a quick geochemical analysis shows you the chemical composition of your fossilised trilobites. Heck, a taste test could confirm that they’re not made of cheese.

The next criterion for a hypothesis is truth. I don’t want to get into any philosophical wankery about what truth is and what truth means; what’s meant here is that your hypothesis has to conform to the most basic factual standards that we have. As an example, I chose a monumental screw-up from Cyprus’s own Ministry of Education.

The pictured page is taken from the new school year’s 7th grade (12-13 year olds) biology textbooks for the public schools, and it brings forward a hypothesis for the origin of modern biodiversity: some dude called Noah built a big boat and brought all organisms on it, to save them from a global flood that lasted 40 days. He then released them. This is, of course, the myth of Noah’s Ark from the Book of Genesis, part of the Christian mythological canon.

This does not conform to the truth criterion for way too many reasons to list. We know that this is nothing more than a fable, plagiarised from earlier fables from the Babylonians and Sumerians (see here). We know that the story is logistically impossible. Building the boat is one story, gathering up all biodiversity is another, and both are as impossible as each other (how did he plan on making sure he got every single bacterium without a microscope?). We know that there was no global flood at any relevant time (there was one large local flood, which was probably the original inspiration for the story). And, most importantly, we have the entire fossil record telling us that the origin of modern biodiversity did not involve a boat, either literal or metaphorical (there was no single center of origin for modern biodiversity).

In other words, the story is complete bunkum and fails as a hypothesis because it completely ignores established fact. If, like any reasonable person, you feel a sense of outrage at such bullshit being taught to 12-13 year olds, feel free to send an e-mail to our ministry of education. Here is my own letter, with personal intro and without personal intro. You can also sign this petition (Greek) that we at Cyprus FreeThinkers set up.

The next criterion is objectivity. What’s meant here is that your hypothesis has to be based on lines of evidence that don’t just come from the subjective melting pot that is your brain. It’s a big problem in palaeontology, given the nature of fossils.

The example shown is that of the Apex Chert “microfossils”. The validity or not of their interpretation is not what’s under scrutiny here, just the strength of this one piece of evidence, taken from one of the original papers where they were presented. See this post for more about them; basically, the Apex Chert comes from a 3.465 Ga locality in Australia with these strange structures preserved in it, and these have been postulated as being the earliest body fossils of microorganisms (they have a mention in every relevant textbook or popular book).

Consider this one piece of evidence for this hypothesis presented up there: pictures (objective evidence) and drawings (subjective). The pictures are fine, even if not very informative. The drawings, however, are not very solid – they are too subjective and show only what Professor Schopf saw in these structures. If I were to draw them, I probably wouldn’t see the same things. As I said, this is a problem in a lot of palaeontology, especially when dealing with such old and enigmatic stuff. The surefire way of supporting your drawings is finding instruments that support your interpretations – pictures, 3D models, geochemical analyses. But drawings alone are not sufficient, at least not without copious amounts of justification.

The last criterion is realism. A hypothesis has to be realistic. I couldn’t think of a real palaeontological hypothesis that’s not realistic, so I turned to a notorious lunatic from the internet, Wretch Fossil. I knew him from lurking around Usenet, and he also has a website (the one on the slide is his old domain, doesn’t work anymore).

His MO is rather simple. He looks at thin sections of meteorites and moon rocks, and misinterprets what are basic mineralogical and petrological structures as animal parts (look at the captions in the screenshot up there: neurons, brain tissue, blood vessels, meat).

If Wretch Fossil was smart – a rather difficult thought experiment to go with, but bear with me – then he would have misinterpreted these structures as “microorganisms”, or something similarly vague. That would have fulfilled the minimal standard of realism, since it is not entirely implausible for biology at a microscopic scale to have happened somewhere else in the universe (I would still call him a crackpot, but someone with an investment in astrobiology might give him some benefit of the doubt).

However, his critical error is saying these structures are animalian – and brain tissue, blood vessels, neurons, and meat are distinctly, autapomorphically animalian structures, not found in any other taxon. The reason is that animals, the Metazoa, are just a single branch of evolution, one of over 50, and it undoubtedly originated right here on Earth, as all sources of evidence tell us. Heck, we even have good glimpses of their early fossil record (with major gaps, of course). In other words, postulating that this one single branch of evolution can be found in space is quite simply insane, even if convergent evolution is brought into play.

One thing that you will never see is any sort of minimal boundary for how realistic a hypothesis has to be to be acceptable. The reason is that in science, you never deal with absolutes of certainty, just levels of certainty. This is in contrast to maths, where you do have a concept of proof; in science, all you have are probabilities and likelihoods.

To demonstrate this, look at the four fossils on the slide. The ones on the left and in the middle are very obviously snails, as you can tell from the coiled calcitic shell. In colloquial terms, anyone would say these are “definitely” snails. In purely scientific semantic terms, one would say these are 99.999999% snails, leaving a 0.000001% opening in case future systematic changes disrupt what we currently characterise as “snails” (it happens to even the most iconic groups; think “Reptilia”).

Now look at the two on the right. Above is Acaenoplax from the Silurian Herefordshire locality, described by Sutton et al. (2004) as a relative of the Aplacophora, wormish molluscs without a shell. However, it’s not so hard to reinterpret it as a polychaete worm (Steiner & Salvini-Plawen, 2001). What this means is that you can only say that this is “probably” a mollusc, but not with any higher level of certainty; if you’re confident in your analysis, you would say it’s “likely to be” a mollusc.

The fossil below is pretty much the same story: the Ediacaran Kimberella. Current consensus places it as a bilaterian animal due to the bilateral symmetry of the body fossil. Trace fossils associated with it have allowed a further consensus to form that Kimberella is a stem-group mollusc. However, nobody will say that “Kimberella is definitely a mollusc”, because that wouldn’t be intellectually honest.

The fact that science deals only in levels of certainty and not in absolute proof underlies what is one of the most powerful concepts of the scientific method: that of falsifiability. For a hypothesis to be considered scientific, it has to be falsifiable, i.e. you must be able to show it to be wrong. By definition, every single scientific hypothesis fulfils this criterion.

To demonstrate it, we’ll look at the anomalocarids, stem-group arthropods characterised by their large eyes, pineapple mouths, and great appendages. They were the very first apex predators back in the Cambrian, and reached a reasonable diversity – we know of three complete body fossils and a dozen or so isolated appendages.

Anyway, up until five years ago, anomalocarids were known only from the usual Cambrian Lagerstätten – Burgess, Chengjiang, Sirius Passet. Combined with the fact that a small mass extinction apparently occurred at the end of the Cambrian, it was most reasonable to hypothesise that anomalocarids died out at the end of the Cambrian.

But then this fossil was discovered from the Hunsrück Slates: Schinderhannes bartelsi. The Hunsrück Slates are Devonian in age, having been deposited 405 Ma – that’s 100 million years after the last anomalocarids of the Burgess Shale. And Schinderhannes is a clear-cut anomalocarid.

Therefore, the hypothesis that anomalocarids died out at the end of the Cambrian has been falsified.
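
The logic of that falsification can be sketched in a few lines of Python. This is only an illustration, not anything used in the talk: the `Occurrence` record and the `falsifiers` function are hypothetical names, and the figure of roughly 485 Ma for the end of the Cambrian is an assumption added here. The 405 Ma age is the one quoted above, and the 505 Ma age for the Burgess Shale simply follows from the 100-million-year gap quoted above.

```python
from dataclasses import dataclass

# Approximate age of the end of the Cambrian in millions of years ago (Ma).
# Assumption for this sketch; the talk does not give the figure.
END_OF_CAMBRIAN_MA = 485.0


@dataclass(frozen=True)
class Occurrence:
    """A fossil occurrence: a taxon name, its locality and its age in Ma."""
    taxon: str
    locality: str
    age_ma: float


def falsifiers(occurrences, cutoff_ma=END_OF_CAMBRIAN_MA):
    """Return the occurrences that contradict 'the group died out by cutoff_ma'.

    Ages count backwards from the present, so a fossil younger than the
    cutoff has a smaller age in Ma.
    """
    return [o for o in occurrences if o.age_ma < cutoff_ma]


# 405 Ma is quoted in the text; 505 Ma is derived from the quoted
# "100 million years" gap and is approximate.
cambrian_only = [Occurrence("anomalocarid", "Burgess Shale", 505.0)]
with_hunsrueck = cambrian_only + [
    Occurrence("Schinderhannes bartelsi", "Hunsrück Slates", 405.0)
]

print(falsifiers(cambrian_only))   # nothing contradicts the hypothesis yet
print(falsifiers(with_hunsrueck))  # the Devonian find contradicts it
```

The point of the sketch is the asymmetry: an empty list never proves the hypothesis, it only means nothing has contradicted it so far, whereas a single younger fossil is enough to reject it.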

The key to falsifying a hypothesis is to always test it. Hypothesis testing is just about what a scientist does all day when working. As an example, imagine you’re digging around in Carboniferous sediments and you find this spectacular insect fossil, which you immediately recognise as a protodonate.

You measure it and notice it’s 70 cm big – the largest insect to have ever lived, as far as we know. This is a far cry from the largest insects nowadays, which are 16 cm big beetles. So as a scientist, you have to hypothesise about what made these insects grow so large. You know from geochemistry that the oxygen levels back in the Carboniferous were much higher than today, so you hypothesise that the oxygen levels had something to do with the large size, knowing full well that insects breathe mostly through diffusion because of their tracheal system.

You then have to test this hypothesis, and there are two ways. The first is to look at the fossil record of insects and see if their general size changes correlate with oxygen level changes (they do, roughly). The other way is to apply the uniformitarian principle, that processes happening today also happened in the past (the laws of physics haven’t changed, etc.). In this case, you can raise insects in artificially oxygen-enhanced atmospheres in the lab (and you notice that they do get bigger).

From there, you can then hypothesise about how these enormous animals lived. They have strongly-supported veins, so you hypothesise that they were agile predators. Testing this hypothesis can be through modeling (CFD on your computer, scale model, a robot), or by looking for modern analogues and comparing (dragonflies).

Once your hypothesis is tested with positive results, you can then go to a conclusion by inference. There are three types of inference used in science, all of them known since Aristotle and probably from before: deduction, induction, and abduction.

Deduction is when you have a set of statements, and your inference is a logical follow-up to those statements. “Humans are mortals. I’m a human. Therefore I’m mortal,” is the classic example. As long as your statements are true, your deduction will also always be true.

Induction is basically prediction based on previously-known facts. The classic example is from the periodic table in chemistry: if you know the properties of calcium and magnesium, you can derive the properties of barium or strontium, for example. Palaeontological examples come a dime a dozen. For example, all known sauropods are enormous (with the exception of island sauropods), so by induction, you can predict that any sauropod you find in the future will also be enormous, and you then infer that gigantism is therefore a sauropod trait.

Abduction, while also known for a long time, was formalised only at the start of the 20th century by Charles Sanders Peirce. The way it works is if you have an observation that can only be explained by a certain hypothesis, then that hypothesis must be valid by virtue of that observation existing.

The classic example comes from the discovery of the cause of the K-T mass extinction, the one that killed off the non-avian dinosaurs. The slide shows a picture of the Fish Clay layer from Stevns Klint, Denmark. Stevns Klint is a cliff that preserves Cretaceous layers at the bottom and Tertiary layers at the top, with the Fish Clay in between. If you were to do a geochemical analysis of this Fish Clay layer, you’d notice an enormous spike in iridium. Iridium is an element that is extremely rare in the Earth’s crust but far more abundant in asteroids and meteorites.

The Alvarez father and son team, back in the 1980s, discovered this iridium anomaly in Italian rocks from the K-T boundary and, by abduction, came to the only logical conclusion: that the iridium must have gotten there from space, probably by asteroid impact. A huge controversy ensued, but the abductive inference was solid – and later vindicated with the discovery of the Chicxulub Crater.

While nobody can generalise how science is done with every person and in every lab, what can be done is tracing the ideal way a research project would go, from the inference standpoint. The first step is to look at what it is you’re researching and, by abduction, think up all the hypotheses that could possibly explain your observations. You then distill your research observations to the most basic facts and, by deduction, think of what logical conclusions those facts lead you to. This will tell you how to test your hypotheses. Finally, you do the science and test your hypotheses, which will allow you to come to conclusions about whether your hypotheses are right or wrong, by induction.

For example, imagine you’re digging around the Messel pit, and you find a fossil of two turtles arranged in this peculiar position.

You have to think of hypotheses to explain this position, by abduction. Your null hypothesis would be that this is coincidental – they happened to be like this and they died. But then you notice the size difference, and you can hypothesise that maybe this is some kind of social clue: maybe parent-offspring or maybe sexual dimorphism.

By deduction, you distil everything down to statistics, in this case measurements. You find more similar fossils and measure them all. You notice a consistent dimorphism in the tails.

Such a consistent difference in the tails is a classic, tell-tale sign of sexual dimorphism. And so, by induction, you come to the conclusion that your turtles are representatives of male-female pairs; the one pictured first was probably in the act of copulation, given the position.
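
The turtle walk-through can be summarised as a small elimination exercise in Python. Treat it as a rough sketch under stated assumptions: the function name `surviving_hypotheses` is hypothetical, and what each hypothesis "expects" is a simplification made up for this illustration (in particular, that a parent-offspring pairing would not produce a consistent tail dimorphism).

```python
# Candidate hypotheses (from abduction), each paired with what it would lead
# us to expect. These expectations are simplifying assumptions for the sketch.
HYPOTHESES = {
    "coincidence": {"size_difference": False, "tail_dimorphism": False},
    "parent-offspring": {"size_difference": True, "tail_dimorphism": False},
    "male-female pair": {"size_difference": True, "tail_dimorphism": True},
}


def surviving_hypotheses(hypotheses, observations):
    """Keep the hypotheses whose every expectation matches the observations."""
    return [
        name
        for name, expected in hypotheses.items()
        if all(observations.get(key) == value for key, value in expected.items())
    ]


# Observations as described in the text: a size difference between the two
# animals, and a consistent dimorphism in the tails across similar fossils.
observations = {"size_difference": True, "tail_dimorphism": True}

print(surviving_hypotheses(HYPOTHESES, observations))
```

With these inputs only the male-female hypothesis survives; had the tails shown no consistent difference, the same function would leave the parent-offspring hypothesis standing instead.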

The final point about inferences that one must keep in mind is the nature of the evidence on which inferences are based. The evidence must be relevant. This may be an obvious statement, but it’s an important aspect to keep in mind at all times, especially in palaeontology, where every single fossil has many stories to tell. I don’t mean to be poetic, but it’s true that any palaeontologist worth his salt will be able to tell you amazing narratives based only on a single fossil.

Consider that random ammonite I pulled off the internet, and think of just how much disparate evidence there is in that single shell.

Here’s a small selection. One can get stratigraphical evidence from it – where it was found, and use it to date other layers where it’s present. One can stick it in a synchrotron or µCT and get the detailed structure of the mouthparts. One can place it in the grand scheme of ammonite evolution, or use it to reconstruct the life cycle of its species (in combination with other conspecific fossils), or it can be used to place more detail in the phylogeny of the ammonoids. Finally, its ecological position can be inferred.

But consider what evidence is used to get each piece of that information. The structure of the mouthparts will inform you about the ecology. The ecology will give you information to enable you to better interpret the stratigraphy. But the stratigraphy will not help you at all with reconstructing the life cycle (stratigraphy deals with geological time, life cycles with biological time). Many life cycles might inform you about evolutionary trends, but the evolutionary trends would never, ever be able to be reconstructed without a solid phylogenetic framework. This framework might use mouthpart information, but it will not help you in interpreting your stratigraphy.

In other words, there is an entire web of evidence, and while there are tentative strings connecting them all, some of these strings are nearly invisible and way too fragile to be used. The very best scientists are the ones who can combine disparate types of evidence in novel ways to come up with innovative concepts.

What has been said so far are the basic generalities of how the scientific method works in practice. We will now see why those generalities have combined to make such a powerful tool.

The first reason, harkening back to the slide about the goal of science, is that science doesn’t allow for miracles. Everything in science can be explained – that’s the goal of science. Consider this painting of the Baltic Amber forest up there. Nothing there is made up by the artist. The mantophasmatodean in the foreground is known to have lived in the Baltic Amber forest. We know that praying mantises could have been caught in amber resin. We have a fairly good idea that those painted trees could have produced this resin. We know that the Baltic Amber forest was this kind of environment. That lemur whatever animal on the tree, and the emu-like animal in the background, both are known to have existed there.

The existence of these freaky animals in the Baltic Amber forest wasn’t miraculously divined by Richard Bizley, nor did he make them up. We know they were there, due to the application of the scientific method.

Painting: Richard Bizley; to be released in David Penney’s 2013 book, Fossil Insects.

The second reason why science is so successful is because science is self-correcting. Hypotheses and inferences have to be, by definition, falsifiable, so anything in science can potentially be wrong. But the only process that will uncover and correct these “mistakes” is more science.

Consider the stylophorans, a bunch of extinct echinoderms. For a long time, three hypotheses for their functional morphology and systematic position have been around. The first one (C and b in the diagrams) is that it’s some form of stem-group echinoderm. The second one (B and d) is that it’s a relative of the crinoids. The third (A and c) is that they’re calcichordates, a hypothesised grouping of the Chordata and Echinodermata as sister clades, with the stylophorans being the last common ancestor. This obviously would have an enormous impact on how we view deuterostomian evolution.

The view was tenable and supportable – a notochord in the stalk, and the area at the front occupied by a brain and organs. However, recent fossil finds and associated trace fossils have shown that the proposed morphology and behaviour of the stylophorans isn’t compatible with the fossil record, and so the calcichordate hypothesis is discarded. Science has corrected itself.

A natural consequence of the self-correcting property of science is that science is always evolving by discovering new things or reworking old knowledge. Science will never, ever be static, by definition.

For example, consider the origin of birds. It’s something that biology with extant organisms could never, ever throw light on. But the discovery of Archaeopteryx (and other theropods) and its correct interpretation provided a huge revolution in the study of both birds and dinosaurs.

In the past couple of decades, China has proven itself to be the most exciting place for palaeontology, in no small part due to the amazingly-preserved feathered dinosaurs found there (many more unpublished fossils still lie in archives or undiscovered).

These new, spectacular fossils give us unprecedented insights into the early evolution of birds, not only among themselves (as in the phylogeny above), but also their evolution from their theropodan ancestors. This is how science always advances – the individual steps are incremental (individual fossil finds), but gathered together, you get an ever-shifting landscape of discoveries.

And the final reason why science has been so successful is precisely because it’s been successful. A tautology, but it’s true. Think of every single advancement in the history of human culture. Now I challenge you to name one that wasn’t the result of science (even before the scientific method was formalised, science existed on the intuitive scale).

That’s right. You can’t name one. The agriculture revolution, from the Middle Ages to today – all the work of science. The medical revolutions, all the work of science (and the wrong ones, all corrected by science). The technological revolutions, again, all brought forward by science. The fact that you can read this post is purely because of science, nothing else.

And I’ll take this one step further. Any modern scientific revolutions would not have been possible without the application of the scientific method in palaeontology. The reason is oil. No matter what your stance on environmentalism is, there’s no use denying that our entire world runs on oil. If you want to deny that, shun the use of plastics (including medicinal products and apparatus). Don’t live in a building, since buildings are all built with oil-run machines. Just go into a cave and live as a hermit.

Oil is all micropalaeontology, not just in its composition, but in prospecting for oil. Long gone are the days when you can dig an oil well with a shovel – we’ve exhausted all of those. Now, all the oil is either buried deep, or well-hidden. And the only way to get to it is by trusting a micropalaeontologist to use the inferential methods outlined earlier to properly get the layout of the underground, and guide the drill straight into the oil reservoir. Otherwise, he’s cost the oil company millions of dollars, and is doomed to a future of selling hot dogs.

That does it for the talk. Just a bit of advertising for me – you can help me in applying the scientific method, by helping support my research project on Petridish.org. Share it around your social networks, and consider donating if you can! See here for more scientific background info.