My lab’s new lab notebook backup system. Part 2: The system

In yesterday’s post, I talked about my motivations for seeking a new system for backing up lab notebooks and data sheets. Here, I describe the system we’re now using for backing up lab notebooks, data sheets, etc. I think it’s working well. At the end, I ask for suggestions of systems that work for backing up files on lab members’ laptops, which I think we could do a better job of.

The system

I bought a lab iPad. One of the concerns I’d had with our old system was that it relied on people in the lab using their own smartphones. Now, there’s a lab iPad for everyone to use. It’s in a handy blue case that hopefully will protect it to some extent; our lab uses lots and lots and lots of water!

The iPad I bought is just a wifi-enabled one. But, if our fieldwork wasn’t local, I might have gone with a cellular-enabled one to allow easy data backup in the field.

We created a lab AppleID and linked that to the iPad.

We created a lab Google Drive account and linked that to the iPad. We also considered a lab Box account, since Michigan has links with both Google Drive and Box. They seemed roughly equivalent to me, and I decided to go with Google Drive for no particular reason. This required help from my IT folks so we could have it as an @umich.edu account.

I loaded a scanner app on the iPad. I used the ScannerPro app, based on Elizabeth’s recommendation, though someone else recommended CamScanner, which seems pretty equivalent. Other notes:

I didn’t want to link my credit card (or anyone else’s!) to the lab AppleID, so I gifted the ScannerApp (which was only a few dollars) to the lab AppleID from my personal account. (Many thanks to Elizabeth for suggesting this workaround!)

We set the ScannerApp up so that files that get scanned automatically connect with our lab Google Drive account. Each lab member has a different file, and they can add new “pages” on to the end of their lab notebook as they do more work. An advantage to this system is that datasheets can be integrated right into the lab notebook file in sequential order. It would also be possible to add in, say, a picture showing an important observation or phenomenon.

We trained the lab in how to use it. My technician Katie was the first one to really use ScannerApp. She scanned in various lab notebooks, then gave the lab an overview at a lab meeting:

iPad showing view in ScannerApp showing the different lab notebooks that have been scanned.

Katie Hunsberger demonstrated how to use the iPad and ScannerApp to back up a lab notebook

I check in with the lab about whether they are updating their files. I am planning on making this a section of the mentoring plans that I do with the lab. We update those three times a year but, based on travel and other unusual circumstances, our summer updates are somewhat delayed this year. So, I recently emailed my lab members:

Hi all,

I’m writing to check in to make sure you’re all backing up your lab notebooks, data sheets, and electronic data files. We have a lab iPad and scanner app that should make backing up of lab notebooks and data sheets straightforward. If you don’t know how to use that, please let me know and we’ll help you out!

Please reply to this email to let me know:

1. If your lab notebooks and data sheets are all currently backed up using the lab iPad + scanner app + google drive system.

2. If the answer to 1 is no, please let me know when you anticipate having everything backed up by and if there’s something you need help with.

3. If all your excel files, R code, etc. that you use for data analysis, word files with manuscript drafts, etc. are in at least two locations (one of which can be the cloud). The system I use for my files leads to them all being on my laptop, desktop, and in the cloud. Ideally, you’d also have a way of retrieving an older version of a file in case something gets messed up.

4. If the answer to “3” is no, let me know if you need help with setting up a system and what your plan is for getting things backed up. Hard drives can be expected to last 3-5 years (plus laptops are prone to being stolen or misplaced) so it’s super important to have everything backed up!

I think that some people will keep things backed up well without reminders, but that others might need more prodding. So, I’ve set that email to reappear in my inbox in 3 months so that I remember to check in with everyone (plus plan on checking in when updating mentoring plans, as I said above.)
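The "at least two locations" check from that email can be partly automated. Here is a minimal Python sketch, assuming a lab member keeps a working folder on their laptop and a mirrored copy somewhere else (for example, a locally synced cloud folder). The function names and the folder layout are hypothetical, not part of our actual setup. It lists every file whose backup copy is missing or differs from the working copy.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_missing_from_backup(primary: Path, backup: Path) -> list[str]:
    """List files under `primary` that have no identical copy at the
    same relative path under `backup` (hypothetical folder layout)."""
    problems = []
    for source in sorted(primary.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(primary)
        copy = backup / relative
        # Flag the file if the copy is absent or its contents differ.
        if not copy.is_file() or sha256_of(copy) != sha256_of(source):
            problems.append(relative.as_posix())
    return problems
```

Something like `files_missing_from_backup(Path("~/research").expanduser(), Path("~/GoogleDrive/research").expanduser())` (both paths made up) would return an empty list when everything is mirrored. Note that this only checks one backup location and says nothing about whether older versions of a file can be retrieved.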

Future directions & request for suggestions

I think this new system is sufficiently user friendly that, with periodic check-ins to remind people about the importance of backing everything up, it should work for backing up the lab notebooks and data sheets. And the lab computer (which has all our photomicrographs on it, along with several really important data files) has a good backup system on it, including a system (CrashPlan) that archives past versions of files.

The main thing that I want to improve in the future is the system (and culture) we have regarding data files, R code, manuscript files, etc. that people work on on their personal computers. Right now, my lab is using a hodgepodge of approaches: some rely on their Dropbox accounts, some use GitHub, and several rely on external hard drives for backups (though, with the latter, if the external hard drive and laptop are in the same place for a non-trivial amount of time, water damage, fire, or theft that affects one could easily affect the other, so I don’t view this as an ideal backup system).

So, this post was motivated both by 1) wanting to share a lab notebook backup system that has worked well for my lab in case it helps other labs (I’ve seen others ask on twitter, too!), and 2) wanting to find out how other labs deal with the files that are on lab members’ laptops. I’d love to hear ideas!

I actually don’t even use datasheets anymore; nearly all data is collected on my smartphone. No need for datasheets anymore! And no need for data entry! The only thing to do is audit data (which we do extensively) but so much time has been saved.

If you are worried about the iPad surviving around lots of water, there are completely waterproof (fully submersible) cases available from LifeProof. A friend has been using a bunch of these (I think they have like 5 or 6 iPads deployed) around their aquatic lab for a few years now, and they have yet to have one fail from water damage.

Have you considered connecting your drive to an Open Science Framework project (https://osf.io) so that as members transition off the projects their data remains in a centralized place? OSF allows you to connect various drives (Google, Amazon, Box, Dropbox, Github, etc.) dynamically to a single project so you don’t have to move everything. Happy to help you investigate if you’re interested.

For those who do a lot of idea sketching [not just data recording] with pen and paper, a backup app that allows you to search your files for words (like ctrl-f but for handwritten words) can be useful. It can make retrieving backed up notes easier. Such apps are usually not free though. Evernote is one of them. I used it a lot, but ultimately I stopped paying for it when I started coding more frequently than writing equations with pen and paper.

Two things have really changed my research life over the past five years: the first is embracing GitHub for version control of code, data, manuscripts, and my research group’s individual and combined science, and the other is switching over to electronic data collection. For ecologists who haven’t made the switch from paper field books to iPads and electronic data collection, it is not as scary as you might think!!!

Electronic data collection can be more rigorous with error checking as data are collected to prevent mistakes. Data can be better backed up. And finally it forces us to put thought into the structure of data before we collect it (significant digits, continuous or categorical data, are the data unrestricted or constrained to a particular range or particular set of values, etc.), which helps down the road when it comes time for analysis. Electronic data collection has saved days, if not months, of data entry each year for my team and has allowed us to go from ecological monitoring in the field to analysis of results within hours instead of days. Our work flows are streamlined and our iPads are waterproof, so data collection can occur under any conditions – and we work in the Arctic, so we experience it all from wet to dry, hot to cold, rain, snow, you name it.
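As a rough illustration of the kind of error checking described above, here is a minimal Python sketch that validates one record against a fixed set of categories and an allowed numeric range. The field names, the species set, and the height range are all hypothetical placeholders, not the actual protocol; a spreadsheet app would do this with its own cell rules rather than with code.

```python
# Hypothetical constraints, chosen only for illustration.
ALLOWED_SPECIES = {"Salix arctica", "Dryas integrifolia"}
HEIGHT_RANGE = (0.0, 200.0)  # assumed plausible bounds, units not specified


def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in one data record (empty if none)."""
    errors = []

    # Categorical check: species must come from the known list.
    species = record.get("species")
    if species not in ALLOWED_SPECIES:
        errors.append(f"unknown species: {species!r}")

    # Range check: height must be a number within the allowed bounds.
    try:
        height = float(record.get("height"))
    except (TypeError, ValueError):
        errors.append(f"height is not a number: {record.get('height')!r}")
    else:
        low, high = HEIGHT_RANGE
        if not (low <= height <= high):
            errors.append(f"height out of range: {height}")

    # Count check: leaf count must be a non-negative whole number.
    leaves = record.get("leaf_count")
    if isinstance(leaves, bool) or not isinstance(leaves, int) or leaves < 0:
        errors.append(f"invalid leaf count: {leaves!r}")

    return errors
```

Calling `validate_record({"species": "Salix arctica", "height": 42.7, "leaf_count": 3})` returns an empty list, while a misspelled species name or a negative count produces one message per problem, which is the point of catching mistakes at collection time rather than at analysis time.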

In 2015, my research group moved to using iPads for data collection. At first I was skeptical. I had used electronic devices in the past with mixed success and had given up, but of course technology had progressed and working with multiple devices and syncing data is much easier now. So, my PhD student and postdoc convinced me to lose the field notebooks (we still carry them, but now they are for our personal notes rather than the main data collection efforts) and go digital. Here is our electronic data collection workflow:

Each member of the field team has access to an iPad for data collection. We use the Numbers app or the Sheets app on iPads to enter data (thus skipping the paper and pencil field book stage). Before heading out to the field, we prepare our spreadsheets, which also helps us to think about exactly what data we should collect and what the best format is for collecting those data. We have waterproof cases for the iPads, so we can use them under any conditions including quite heavy rain. Entering the data straight into a digital format helps avoid transcribing errors when copying information over from field books to spreadsheets, and it means we can go through lengthy and detailed field protocols much quicker.

For example, for our long-term vegetation monitoring, instead of having to manually write down each species name and how tall the individual is (for 1200+ records), we have a drop down menu with species names, ordered by commonness, and when the observer says “Salix arctica, 42.7, three leaves“, the recorder can select the species from the drop down list, note the individual’s height numerically using the appropriate decimal places and then increment a count for the number of leaves sampled. This speeds up the data collection in the field and removes the need for data entry afterwards.
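One way to build a species list ordered by commonness, as in the drop-down described above, is to count how often each species appears in earlier records. This is a minimal Python sketch under that assumption; the function name and the idea of deriving the order from past records are illustrative, and the list would still need to be pasted into the spreadsheet's drop-down settings by hand.

```python
from collections import Counter


def species_by_commonness(past_records: list[str]) -> list[str]:
    """Order species from most to least frequently recorded.

    Ties are broken alphabetically so the order is reproducible.
    """
    counts = Counter(past_records)
    return sorted(counts, key=lambda name: (-counts[name], name))
```

Putting the most common species at the top means the recorder scrolls the least for the entries that come up most often.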

In the field, we take photos of our experimental plots, species that we can’t ID, and the field team in action that can be quickly associated with the data themselves. We also take regular screen captures as we are working, providing a digital backup during data collection. At the end of the day, we sync our iPads with each other and with multiple other devices including computers and hard drives (this works without internet at our Arctic field sites, and it is an even quicker process if you have internet access) and as soon as we are back in the internet world, we sync our iPads to the cloud. We use version control when appropriate, including version numbers in file names. Thus, all our data are backed up and synced, but also integrated into our data handling workflows so that we can, for example, save files as .csv and import them into R for incorporation into the master dataset and proceed with data checking and analysis.
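For the "version numbers in file names" step, here is a minimal Python sketch of one possible naming scheme. The `_v001` pattern, the function name, and the folder layout are assumptions made for illustration; the original workflow does not specify how the numbers are formatted.

```python
import re
from pathlib import Path


def next_versioned_path(folder: Path, stem: str, suffix: str = ".csv") -> Path:
    """Return the next unused path of the form <stem>_vNNN<suffix> in `folder`.

    The _vNNN convention is a hypothetical choice, not the original scheme.
    """
    pattern = re.compile(rf"^{re.escape(stem)}_v(\d+){re.escape(suffix)}$")
    versions = []
    for entry in folder.iterdir():
        match = pattern.match(entry.name)
        if match:
            versions.append(int(match.group(1)))
    # Start at 1 when no earlier version exists.
    next_version = max(versions, default=0) + 1
    return folder / f"{stem}_v{next_version:03d}{suffix}"
```

Saving each day's export under a new number, rather than overwriting the previous file, keeps older versions retrievable even on a device that has no internet access and no version control installed.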

Combining these practices with version control using GitHub dramatically increases the speed at which we can carry out laborious field monitoring protocols and update analyses with the new data. Of course we quickly spend all of the time that we have saved on trying to improve our statistical modelling approaches or trying to build even larger datasets, but I think the result is more rigorous and exciting science.

So, if you haven’t tried completely electronic data collection, I say take the plunge and give it a go! Four years and counting, and I haven’t looked back yet. And, if you want to find out more about GitHub for research groups check out our Coding Club tutorial on the subject:

One thing I wondered when reading it: how do you set up the forms in the first place? I suppose I need to spend more time checking out Numbers and Sheets. It sounds like you can basically create a system that’s like a Google Form for your data. Is that right?

Yes! Pretty much any type of sheet design is probably possible. We haven’t explored the full range of data entry options ourselves. Mostly we use a combination of free-form entry of numbers and text, drop down menus (particularly useful for species lists) and some different types of counting cells. We have used both Numbers and Sheets and don’t have a strong preference, as they both have similar functionality. Spending a bit of time to get the design right for a data entry sheet can save lots of time out in the field, so it is well worth the effort!

Copyright

(C) 2011-2018 by the author of each individual post (specifically Jeremy Fox, Meghan Duffy, Brian McGill, or as otherwise noted at the top of each post).
The copyright holders have made these posts available on the Dynamic Ecology website at the present time for reading and commenting to benefit the scientific community. Hypertext links to posts which transfer readers to our website are also welcome. However, the authors retain all other rights to the posts including the rights to republish elsewhere and to charge for access. The authors also prohibit other uses including copying or republishing entire or substantial portions of posts without the author's permission, but do allow quoting small sections as allowed by fair use law for purposes of commentary and criticism.