Last week, I had the dubious pleasure of revisiting some work I did over three years ago. Back then, as the Census of Marine Life was in its final stages, I got together with Edward Vanden Berghe, then managing the Ocean Biogeographic Information System (OBIS), to investigate the suspicion of CoML senior scientist Ron O’Dor that surveys of marine biodiversity largely overlooked ‘the big blue bit in the middle’ – the deep pelagic ocean, by far the largest habitat on Earth.
The idea that Edward and I hit on was to use OBIS to produce a plot that would show if Ron was right. OBIS contained at that time around 20 million records, with each record representing the occurrence of a specific species in a particular location. Only around 7 million of these also noted the depth at which the species had been recorded, but by comparing the depths of these 7 million samples with a global map of ocean depth, we were able to place each of them at a position in the water column. As you can see (and as we showed more rigorously in the resulting paper), Ron was right: in all regions of the ocean, biodiversity records from midwater are far less common than those from the sea bed, or those from surface waters.

Fig 1. Global distribution within the water column of recorded marine biodiversity, using approximately 7 million occurrence records extracted from OBIS in 2009. The horizontal axis splits the oceans into five zones on the basis of depth, with the width of each zone on this axis proportional to its global surface area. The vertical axis is ocean depth, on a linear scale. The inset shows in greater detail the continental shelf and slope, where the majority of records are found. Note this is slightly different from the version previously published, as it is scaled to the 2013 range of data.
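
For anyone curious how a record gets placed in the water column, here is a minimal sketch in R of the general idea: compare each record's sample depth with the sea-bed depth at the same location. This is not the code from the paper's appendix. The column names, the toy values, the zone breakpoints and the zone labels are all assumptions made up for illustration.

```r
# Toy occurrence records (hypothetical values and column names).
# sample_depth: depth at which the organism was recorded (m)
# bottom_depth: sea-bed depth at that location, e.g. looked up
#               from a global bathymetry grid (m)
records <- data.frame(
  sample_depth = c(5, 150, 1000, 3900, 20),
  bottom_depth = c(50, 180, 4000, 4000, 2500)
)

# Height of each sample above the sea bed (m)
records$height_above_bed <- records$bottom_depth - records$sample_depth

# Relative position in the water column: 0 = surface, 1 = sea bed
records$rel_position <- records$sample_depth / records$bottom_depth

# Assign each record to a depth zone based on sea-bed depth.
# These breakpoints and labels are illustrative assumptions only.
zone_breaks <- c(0, 200, 1000, 4000, 6000, Inf)
zone_labels <- c("shelf", "slope", "zone3", "zone4", "zone5")
records$zone <- cut(records$bottom_depth,
                    breaks = zone_breaks,
                    labels = zone_labels)

# Split the water column into thirds and count records per zone and layer
records$layer <- cut(records$rel_position,
                     breaks = c(0, 1/3, 2/3, 1),
                     labels = c("upper", "mid", "lower"),
                     include.lowest = TRUE)
counts <- table(records$zone, records$layer)
print(counts)
```

With real data, a table like counts is the kind of summary that would reveal whether the "mid" layer is under-represented relative to the surface and the sea bed.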

We discussed the implications of this chronic under-sampling of the world’s biggest ecosystem in our paper, but when talking about this work I prefer to quote from another paper, by Bruce Robison:

The largest living space on Earth lies between the ocean’s sunlit upper layers and the dark floor of the deep sea… Within this vast midwater habitat are the planet’s largest animal communities, composed of creatures adapted to a… world without solid boundaries. These animals probably outnumber all others on Earth, but they are so little known that their biodiversity has yet to be even estimated.

Since our paper came out, I have continued to use OBIS data in my research attempting to describe and explain the distribution of diversity in our oceans. At the same time, OBIS has changed too, both structurally - it’s moved from Rutgers in NJ to the IOC Project office for IODE in Ostend - and in terms of its content, now housing over 35 million records, including almost 19 million which recorded sample depth.

So back to last week, and an email from the current manager of OBIS, Ward Appeltans, asking if I might be able to update the figure from our 2010 paper with new OBIS data.

With some trepidation, I opened up the file of R code I’d used for the original analysis. And got a pleasant surprise: it was readable! Largely this was because I submitted it as an appendix to our paper, and so had taken more care than usual to annotate it. I think this demonstrates an under-appreciated virtue of sharing data and code: in preparing it such that it is comprehensible to others, it becomes much more useful to your future self. This point is nicely made in a new paper by Ethan White and colleagues on making using and reusing data easier.

So, rather than days of fiddling, I was able to get the code up and running with new data really quite quickly. Of course, there were a few minor bugs to sort out - one thing I always do with R code now, but didn’t at the time, is to insert the command rm(list = ls()) at the top, to clear my workspace. The fact that my old code didn’t work immediately was, I think, down to my failure to do this - the code required an object that was clearly hanging around in my workspace at the time. But it was simply a matter of correcting a name inside a function and it all worked fine. (Actually, one thing still doesn’t work well, which is getting the figure from R into a satisfactory, scalable vector format which looks nice in other packages - the PDF looks OK (but not great) in Preview but awful viewed in Acrobat, for example - but that’s another story…)
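
To illustrate the general failure mode (not the actual bug in the original script), here is a small hedged sketch in R. The object name bathy and both function names are hypothetical. The first function silently depends on an object in the workspace, so it only works if that object happens to be lying around. The second takes everything it needs as arguments.

```r
# Start from an empty workspace so hidden dependencies show up at once
rm(list = ls())

# Fragile: looks for an object called `bathy` in the workspace
# (hypothetical name, for illustration only)
depth_fragile <- function(i) {
  bathy[i]
}

# Safer: the data the function needs is passed in explicitly
depth_safe <- function(i, bathy) {
  bathy[i]
}

# With a clean workspace the fragile version fails straight away,
# which is exactly the early warning that rm(list = ls()) provides
fragile_result <- tryCatch(
  depth_fragile(2),
  error = function(e) "failed: object not found"
)

# The safe version works wherever it is called
safe_result <- depth_safe(2, bathy = c(10, 20, 30))

print(fragile_result)
print(safe_result)
```

Clearing the workspace does not fix a hidden dependency, but it does make the script fail on the first run rather than three years later.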

What happens, then, to our view of the depth distribution of marine biodiversity knowledge when we increase the number of observations from 7 million to 19 million?

Fig 2. Figure 1 updated to use the c. 19 million suitable occurrence records available in OBIS in 2013.

Actually, rather little: the overall pattern is pretty much the same, with far more records from shallow than deep seas, and a paucity of midwater records at all depths. The big blue bit in the middle remains both big and blue.

Postscript: Ward at OBIS emailed me to suggest that this post comes across a bit on the negative side, which was certainly not my intention. Even back in 2009 OBIS was a phenomenal resource for marine biodiversity research; the fact that in under 4 years, the number of useful records for my analysis has increased >2.5x is amazing. My view is still that the big blue bit in the middle remains both big and blue, but it's very heartening to see the fingers of yellow extending further and further into the colossal deep pelagic ocean. It would be nice to think that our data visualisation exercise has had something to do with this!