Data sharing in translational neuroscience: a success story

As you know, at F1000Research we ask that all authors include all data with their publications. But why share data? What are the benefits? In one example, below, Michael R. Hunsaker shares his successful story of getting a publication based on data that were shared by another lab, after they first used his data. Do you have a story about re-using data? Let us know (for example, in the comments below) and you could be featured on our blog next.

Data Sharing? Why Not?

Once again, scientists are having a conversation about the value of open data sharing. This time, it is because PLOS wrote on their blog that they were going to request that authors upload their data along with their manuscripts. This was requested because, although PLOS clearly states that all data have to be made available upon request, too many scientists were not honoring these requests for data.

I personally and professionally fall on the side of open data sharing, but I chose to take the stance that we need to strictly define what is meant by data and consider how we can prevent any negative impacts upon the authors for sharing these data. In my personal opinion, the best way to share data is for everyone to just share their data. But that also requires that those who want my data actually go to the trouble of asking me for them. I will then gladly send them any of my data, along with an explanation of how to use them and an open invitation to ask me for help if they need it. I do not feel that I necessarily should be an author on any resulting usage, but my impulse is to try to make sure my data are not misinterpreted.

A representative sample of these Twitter conversations has been Storified by F1000Research (including my tweets from @mrhunsaker) here.

As part of these conversations, I mentioned that I have directly benefited from data sharing and have openly shared my data. Theoretical/computational researchers have asked for my data so they could use them to calibrate their models (results here). I have used open software and open data to retool and optimize analysis methods for my own research (MATLAB-based methods and raw data downloadable here) and shared the resulting optimized analysis code in R format on GitHub.

My Experience with Data Sharing

In this post I will provide a clear example of data sharing done right, or at least how I think data sharing should be done. In 2010, at the International Fragile X Foundation meeting I presented what was my dissertation work up to that point. These data suggested there was a behavioral phenotype in a mouse model of the fragile X premutation that phenocopies what we were seeing in people with the premutation. Specifically, I described a number of behavioral tasks that I had developed and how easy they were to run. My presentation received a nice reception and I went home thinking nothing more of it.

In 2012, at the next International Fragile X Foundation meeting I attended, I was approached by a researcher I did not know. She had watched my presentation a few years before and had spoken with her advisor about it. She asked me a number of questions about my experiments, including whether I was running my experiments on the Fmr1 KO mouse. I told her I wanted to but I did not have access to that mouse, so it was fair game. She followed up by asking me for a number of methodological details regarding my tasks and what to expect from control mice.

When I returned from this meeting, I sent her an email with all of my data, not just the control data. She did not ask for these data, but I felt it would be helpful for her to see how the exploration values changed over time during habituation as well as how the raw values were converted into ratios for analysis. I also sent slightly more detailed methods with a list of “be careful not to…” and “make sure you specifically do…” statements added in. Aside from a quick thank you by email, I heard nothing more about this and just assumed they had the experiments in the queue and would get to them in time.

Fast forward to 2013 and I saw something very exciting in my RSS feed: the researchers I gave my data to had not only replicated my experimental methods, but they also showed profound deficits in the Fmr1 KO mouse model. They further used these data as a baseline to test the effects of lithium and GSK-3 inhibitors on cognitive function in Fmr1 KO mice. More details regarding their work can be read on my blog.

I was so excited that I sent an email to congratulate them on their great work. I was also curious whether the data they had collected fit a model I was developing. Specifically, I proposed that the Fmr1 KO mice would fall on a linear spectrum with the fragile X premutation mice I had published previously. As such, I asked if they would be willing to send me their data so I could plot them and run them through a couple of classification algorithms I developed as part of my dissertation work. They agreed and I received a spreadsheet (formatted the same way as the one I had sent them) the very next day.

The results of this interaction and sharing of data were threefold.

  1. It has now been shown rather definitively that the behavioral tasks I developed to test my mice work in other laboratories (in other words, my methods work in other labs!!!).
  2. These cognitive tasks are valid outcome measures in preclinical drug studies. Specifically, lithium and other more specific GSK-3 inhibitors improve performance in Fmr1 KO mice without affecting performance of control mice.
  3. My model has now been validated using a combination of my data as well as data shared by this other lab. The results of this analysis have been published here in F1000Research.

Now this is a clear case of everything going right. I had no obligation to send them my data and they had no obligation to send me theirs. My vested interest in sharing my data with others was in making sure that they would be able to use my experimental methods. I felt that having the control data may help others to tweak the experiments as needed if their control mice did not learn as well as mine had, or at least to guide them to give me a call to troubleshoot. The other lab had no reason to help me out by sending me their data from their mouse, especially since preclinical drug testing was a part of their experiment. In our email correspondence, it was clear that they were just happy for my help and they thought nothing of sharing with me the way I had shared with them.

I do not know if the data I sent them were helpful. They may or may not have helped to guide their experimentation, or the tasks may have worked the first time for them. I do know that, on my part, sending them the raw data showed a high level of confidence in my methods. Had they not shared their data with me, I would still be waiting on someone to do these experiments. I do not have access to this model and I would probably still be harping on all of my collaborators to do the experiments so I could have the data. I do know that, because of their generous sharing, I was able to formally test a hypothesis I had long wanted to test. Fortunately for us all, the data fit the model and resulted in a publication.

An interesting wrinkle in this story is that the other lab declined authorship on the resulting manuscript. When I asked why, they said they felt that once they had published the papers with the data, they were done. They felt that the theory and testing it, even with their data, was my own work and they were happy with an acknowledgement. I cannot help but think part of their perspective was somehow influenced by the kindness I showed them back in 2012. Regardless, I am grateful to them for sharing, and I feel that together we have provided a wealth of data and theory that can help research into fragile X-associated disorders move forward.

In retrospect, I do not know if it would have been easier for everyone involved had I just published those data on figshare as a publicly available dataset in the first place. I think I still would have been contacted regarding the specifics of my experimental methods. As it was, I was cited very prominently in the two manuscripts the other lab published using my methods, and that was more than enough for me. I am always ecstatic to see my methods being adopted by other labs. It would have been nice to have a dataset they could have cited, because as it stands one has to take on faith that their control mice performed similarly to mine (I can vouch they did because I have both datasets). What I do know is that had I been offered authorship on the Fmr1 KO papers, I would have declined. My work was done when I published my methods, and I feel it is my responsibility as corresponding author and a scientist to help out and advise anyone who wants to use my methods on how best to do so.

As a final note, for anyone who wants them, the behavioral data from my dissertation work are openly available for download on figshare and I am more than willing to have a conversation with anyone who wants to use them, just in case I can be helpful!
