CODECHECK: An open science initiative for reproducible computational research
27 September, 2021 | Stephen Eglen and Daniel Nüst

What if the computations underlying research articles were checked as part of the peer review process? Today, we hear from Daniel Nüst and Stephen J. Eglen, the authors of a new Method Article that proposes a novel workflow to integrate the evaluation of computational reproducibility into peer review. We discuss the core principles of CODECHECK, incentives for ‘codecheckers’, and the potential impact for open science.
Please tell us a little bit about yourselves and your field of research.
Daniel is a researcher at the Spatio-temporal Modelling Lab, based at the Institute for Geoinformatics (IFGI) at the University of Münster. Daniel is just completing a PhD in the context of the DFG project Opening Reproducible Research, in which he develops tools for the creation and execution of research compendia in geography and the geosciences.
Stephen is a Professor of Computational Neuroscience in the Department of Applied Mathematics and Theoretical Physics, University of Cambridge. Stephen’s primary research interests are in understanding the development of the nervous system using computational approaches.
We share a mutual interest in open research, and in particular ways to encourage reproducible computational research.
What inspired you to create CODECHECK? What is it exactly?
We joined forces to create CODECHECK after learning that we had independently submitted similar research projects to an open research grant call from Wellcome. We jointly created CODECHECK after receiving a science mini-grant from Mozilla.
CODECHECK is our system for checking that code and data provided alongside a research article can regenerate the figures and tables contained in that article. A codechecker independently follows the authors’ instructions for reproducing these artifacts, and if successful, writes a certificate to summarize the reproduction. These certificates can then be freely shared alongside the research article to demonstrate that the results are reproducible.
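The core loop a codechecker performs — running the authors' documented command, confirming the expected artifacts appear, and recording a summary — could be sketched as follows. This is a minimal illustration only: the `run_check` function, the JSON summary format, and the file names are our own assumptions for this sketch, not the actual CODECHECK tooling or certificate format.

```python
import datetime
import hashlib
import json
import pathlib
import subprocess


def run_check(command, expected_outputs, certificate_path="certificate.json"):
    """Run the authors' reproduction command and summarize the results.

    command          -- the reproduction step as documented by the authors
    expected_outputs -- paths of figures/tables the command should regenerate
    certificate_path -- where to write the JSON summary of the check
    """
    # Execute the documented reproduction step.
    result = subprocess.run(command, shell=True, capture_output=True, text=True)

    report = {
        "checked_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "command": command,
        "return_code": result.returncode,
        "outputs": [],
    }

    # Verify each expected artifact exists and record a checksum for the record.
    for path in expected_outputs:
        p = pathlib.Path(path)
        entry = {"file": path, "exists": p.exists()}
        if p.exists():
            entry["sha256"] = hashlib.sha256(p.read_bytes()).hexdigest()
        report["outputs"].append(entry)

    # Write the summary that a human-readable certificate could be built from.
    pathlib.Path(certificate_path).write_text(json.dumps(report, indent=2))
    return report
```

In practice the codechecker then inspects the regenerated figures and tables by eye against the manuscript; the automated part above only confirms that the workflow runs and produces the named artifacts.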

How did community feedback shape the workflow and principles which uphold CODECHECK?
During development, we talked to numerous parties, including publishers, potential authors, and reviewers. This was very helpful for refining and reshaping our ideas. Our interpretation of the ideals that underpin the Open Science movement is certainly also reflected in the principles, with a strong emphasis on transparency, proper credit, and open collaboration for scientific progress.
One of CODECHECK’s core principles is “communication between humans is key”—why is this so important to the process? Could combining CODECHECK with Open Peer review be beneficial in this regard?
We regard the CODECHECK as a form of peer review; however, we opted to let the codechecker and author talk directly to each other, rather than via a third party such as a journal editor or a technical system that tries to uphold anonymity. We felt this was important for making the process efficient and constructive, because it is very hard to provide a complete set of instructions for an unknown party to reproduce one’s workflow. This is one of the reasons for the whole reproducible research “crisis”, and connecting people is much less time consuming and more educational than creating documentation for an unknown user. As codecheckers are not evaluating the research for some notion of ‘quality’ or ‘correctness’, we think this process should be cooperative. And certainly, yes, a CODECHECK could formally be seen as an open peer review process, as one possible implementation of our principles.
What are the incentives for someone to check the code?
This is a great question. Codecheckers have volunteered for this process for many reasons, as we outline in our discussion, including gaining exposure to new results, making professional contacts with other groups, and supporting new open research initiatives. Credit is currently given by making the certificate available as a citable object, which we hope will be recognized as a valuable service, along with other reviewing activities. In several cases the final paper also linked to the CODECHECK certificate, e.g., in the acknowledgements section or where the data and software availability is documented. We are grateful to our collaborating editors and publishers for making this possible.
Why does this represent a good opportunity for early career researchers?
We can think of several reasons why ECRs might benefit.
- They have time and interest in experimenting with new ways of working.
- They are often already familiar with the new technologies that underlie computational research. (Stephen has learnt much from Daniel about many of these technologies.)
- As mentioned above, being a codechecker may naturally lead to discussions with authors in research labs with mutual interests, leading to new collaborations. Such connections are especially crucial, but also hard to build, early in your career.
- They gain (first) experience of being part of a peer review process, get to know journal editors and, over time, may serve as regular scientific peer reviewers.
How can CODECHECK contribute to increased trust in research outputs?
It is still commonplace today to read in papers statements such as ‘data available upon request’. By contrast, if you see a CODECHECK certificate, you know that the data (and code) are already available, and that the results in the paper have been reproduced by someone else. We think this sends a positive signal that the author has documented their research outputs in such a way that others can reuse them appropriately.
What impact do you hope the CODECHECK initiative will have?
We think it is just one of a number of complementary approaches to tackling the ‘reproducibility crisis’. We think that our approach is suitable for large-scale work, where creating interactive documents (such as eLife’s Executable Research Articles) is not feasible due to the computational demands. Furthermore, CODECHECK itself is an open initiative, so different communities may implement codechecking in different ways. The common principles and shared understanding may help to spread the practice further, even across disciplines.
CODECHECK has already been adopted by an Open Science officer at a Dutch university, where papers are checked following our guidelines before submission, which is an unintended but very positive example. We hope that journals will adopt it too, as discussed next.
How can the scientific community and the publishing industry work together to overcome the barriers to implementing CODECHECK at scale?
We think that our current work demonstrates that CODECHECK is one sensible approach to demonstrating reproducibility of research articles. However, to develop it at scale, we need several resources:
- Compute resources to allow codecheckers to re-run compute-intensive jobs. Currently, we have been using compute resources available locally to codecheckers.
- A CODECHECK editor, either full- or part-time, to help coordinate activities.
- More volunteer codecheckers are always welcome.
- Open-minded publishers and editors who want to embrace reproducibility and openness based on the CODECHECK principles.
Collaboration between researchers and the publishing industry, perhaps via joint grants, would help greatly in this regard.
Want to learn more about CODECHECK? Read the full Method Article, CODECHECK: an Open Science initiative for the independent execution of computations underlying research articles during peer review to improve reproducibility, on F1000Research.
You can stay up to date with the latest CODECHECKs and find ways to get involved at codecheck.org.uk. You can also find @cdchck, Daniel, and Stephen on Twitter, and use #CODECHECK to share your thoughts.
References
[1] Nüst D and Eglen SJ. CODECHECK: an Open Science initiative for the independent execution of computations underlying research articles during peer review to improve reproducibility [version 2; peer review: 2 approved]. F1000Research 2021, 10:253 (https://doi.org/10.12688/f1000research.51738.2)