Teaching researchers to write better code

As research becomes increasingly data- and computer-driven, more and more researchers are developing tailor-made software for their research projects. However, only a few of them are trained coders, and this is where the CodeRefinery project comes in.

Organized by the Nordic e-Infrastructure Collaboration (NeIC), CodeRefinery teaches students and researchers who write code used in research how to use professional tools to develop and maintain research software efficiently.

Reusable and reproducible software

According to Project Lead Radovan Bast, the aim is to train Nordic research groups and projects in state-of-the-art tools and practices in modern collaborative software engineering. Researchers are encouraged to migrate from ad hoc in-house solutions to collaborative infrastructure that supports code review and automated testing, and to build modular, reusable, maintainable, sustainable, reproducible, and robust software. As part of this work, the CodeRefinery project hosts the GitLab software collaboration and version control system and provides it to the Nordic community.

Handling ATLAS data

One of the users of CodeRefinery’s GitLab service is Maiken Pedersen from the Center for Informatics at the University of Oslo. She works with handling data and computing jobs from the ATLAS experiment at CERN (image above):

- CERN in Geneva is the largest particle physics laboratory in the world. A special grid infrastructure, the LHC Computing Grid, is used to store data from the CERN experiments. One of the experiments at CERN is the ATLAS experiment, and part of the ATLAS data is stored in Oslo as part of a distributed Nordic computing centre.

- My job is to provide access to this data to the ATLAS community of high-energy physicists. I’m involved in a computing software product called ARC, the Advanced Resource Connector. Through ARC the computing centre is connected to the Worldwide LHC Computing Grid. ARC, together with some other software components, enables physicists to analyse ATLAS data from anywhere in the world using the grid. A job arrives at a computing centre holding the ATLAS data, maybe here in Oslo; the ARC software receives it, sends it to be processed and delivers the result back to the researcher.

Improving version control

- I’m the release manager of the ARC software. We are involved with CodeRefinery because we’ve decided to move our version control system to a GitLab repository hosted by CodeRefinery. ARC is complex software, with many developers working on different components over many years, so version control is obviously very important for us. Up until now we have used SVN, which has served us well, but it is rather old, and there are lots of nice new tools available for Git via products like GitLab or GitHub. We have had an ongoing discussion for a few years now about going from SVN to Git, but were not able to make the decision, mainly because we would rather not host the service ourselves while still wanting to keep control of it by keeping it in the Nordics. When we learned that CodeRefinery was planning to host a GitLab service, we considered that a great opportunity to finally migrate from SVN to Git.

- We switched in March, and now we are planning a major update of the ARC software, which is all happening on GitLab. A lot of the other software products we work with also use Git, and even at CERN the main experiment we are delivering for, the ATLAS experiment, has moved from SVN to Git. So it is a good thing to work with something that is widely used and popular in the community. For us it is a step into a more modern world of collaborative code development, you could say.