Bringing static research figures to life

30 July, 2014

We have all heard people repeatedly pointing out that scholarly publishing has hardly changed since the first scientific journal article in 1665 and is still done largely within the constraints of a printing press. This is true on many levels, including that of the humble figure. The latest article published in F1000Research by Björn Brembs and colleagues blasts the figure fully into the internet age.

In print, there was no space to publish the raw data or the code used to analyse them. You therefore had to try to summarise all that detail in a few fixed images; as each page added cost in print, you were often limited to one figure providing a single view onto those data.

Why is it a problem?

With only one or two figures to choose from, authors are incentivised to pick the view of the data that best demonstrates their conclusions (rather than the other way around). As we all know, the ‘publish or perish’ culture incentivises novel/significant/positive conclusions over boring/small/negative ones. Meanwhile, referees and readers cannot reuse the data, nor can they check the analysis to see whether they agree with the conclusions.

In the internet age, of course, this makes no sense. There are no longer limitations on page numbers or figure files, or even on hosting data and software (unless they run to several terabytes). This is why F1000Research has always had a mandatory data policy for all our research articles and we also ask for full details of the software used to analyse the data.

So what’s new?

This paper by Brembs and colleagues properly brings the figure to where it should be: a living centrepiece of the article. This article has 2 proof-of-concept figures, both of which are important leaps forward in how a figure is presented. Figure 3 in fact doesn’t really exist. The authors submitted their data and their code and the figure is then generated ‘on the fly’ when you view the paper. Readers can change one of the parameters within the code, and this then changes the figure that is generated.
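To make the idea concrete, here is a minimal sketch of a figure that is stored as data plus code and redrawn each time it is viewed. It is not the authors' code: the measurements, the function names and the choice of bin width as the reader-adjustable parameter are all assumptions made for illustration, and a text histogram stands in for a real plot.

```python
from collections import Counter

# Hypothetical raw measurements; placeholder values, not data from the paper.
RAW_DATA = [0.12, 0.48, 0.51, 0.55, 0.91, 1.02, 1.10, 1.47, 1.52, 1.98]


def bin_counts(values, bin_width):
    """Count how many values fall into each bin of the given width."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter(int(v // bin_width) for v in values)
    return dict(sorted(counts.items()))


def render_figure(values, bin_width):
    """Rebuild a text histogram from the raw values on every call."""
    lines = []
    for index, count in bin_counts(values, bin_width).items():
        lower = index * bin_width
        lines.append(f"{lower:4.2f}-{lower + bin_width:4.2f} | {'#' * count}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Same data, two parameter choices, two different views.
    print(render_figure(RAW_DATA, bin_width=0.5))
    print()
    print(render_figure(RAW_DATA, bin_width=1.0))
```

Nothing is saved as an image here. Changing `bin_width` simply reruns the analysis, which is the sense in which such a figure "doesn't really exist" until someone looks at it.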

Figure 4 is similarly generated ‘on the fly’ from raw data (currently 2 datasets) using a virtual machine. The authors are now inviting other Drosophila researchers to attempt to replicate this study and request a (free) virtual machine. For version 2 of this article, Figure 4 will become a ‘living’ figure: other researchers will be able to upload their data, which will be added to the existing data, and the figure will be replotted in real time to show each lab’s results against the others.
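A living figure of this kind can be sketched in a few lines. The example below is only an illustration under stated assumptions: the `LivingFigure` class, the lab names and the numbers are hypothetical, and a per-lab summary (mean, standard deviation, sample size) stands in for the plot that the real system would redraw.

```python
from statistics import mean, stdev


class LivingFigure:
    """Holds every submitted dataset and re-derives the figure on demand."""

    def __init__(self):
        self._datasets = {}  # lab name -> list of measurements

    def add_dataset(self, lab, values):
        """Add (or extend) a lab's data and return the freshly replotted figure."""
        if not values:
            raise ValueError("a dataset needs at least one measurement")
        self._datasets.setdefault(lab, []).extend(values)
        return self.replot()

    def replot(self):
        """Summarise each lab as (mean, standard deviation, n), sorted by lab name."""
        summary = {}
        for lab in sorted(self._datasets):
            values = self._datasets[lab]
            spread = stdev(values) if len(values) > 1 else 0.0
            summary[lab] = (mean(values), spread, len(values))
        return summary


if __name__ == "__main__":
    figure = LivingFigure()
    figure.add_dataset("lab_a", [1.0, 2.0, 3.0])  # hypothetical original data
    # A second, hypothetical lab uploads a replication; both labs now appear.
    for lab, (avg, spread, n) in figure.add_dataset("lab_b", [2.0, 4.0]).items():
        print(f"{lab}: mean={avg:.2f} sd={spread:.2f} n={n}")
```

The design choice worth noting is that the figure is never edited directly. Each upload only adds data, and the view of all labs side by side is recomputed from the full set every time.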

Of course this means that as an author, you can submit just data and the associated code to analyse that data, rather than having to worry about creating specific figures, which are often time-consuming and troublesome to get into a format that meets the specific requirements of each journal.

Why is all this important?

There has been much debate recently about the irreproducibility of published research. While there has been a push to encourage reproducibility, most journals don’t accept papers that are a straight reproduction of a previous piece of research: partly because of the lack of ‘impact’ of the work and partly because of concern about where to draw the line with duplicate publication. The NIH recently held a workshop with 40 journal editors to try to agree on a set of guidelines that would encourage the publication of articles that identify reproducibility problems with previously published studies. As we’ve seen with the STAP debacle, attempts at reproducibility are crucial (see the F1000Research paper from Kenneth Lee on his attempt to reproduce the subsequently retracted Nature papers).

The two figures in Brembs’ paper now allow you, the reader (and of course also the referee), to properly and quickly assess the data, and hence the associated conclusions. This is in the same vein as the data plotter tool we previously launched. Figure 4 is an even bigger evolution, as attempts at replication from different labs can now be added onto the figure over time so you glean a greater understanding of the level of confidence in the paper’s conclusions. Of course, as with all our papers, all the data and software are also separately downloadable if you want to play with them in a different way or reuse them.

Figure 4 in particular really opens up a new world of opportunities, especially when combined with our versioned articles that can be updated as often as the authors wish. You can now use articles as a central point for large collaborative groups where different pieces of the jigsaw can be added bit by bit in real time to a centrally published figure. As in this article, they can be used to collate attempted replications in real time. And they can transform how we all think about sharing and publishing science as a whole, moving the data, software and analysis centre stage, as opposed to the narrative back-story and conclusions.

Watch out for Version 2 of this article and see the first living scientific figure in action!

Image from Alex on Flickr

topics: Open data, Open research
