Dreaming big: the new DREAM Challenges channel

We are very excited to announce our latest collaboration here at F1000Research, which is with the excellent team at DREAM Challenges. DREAM, which stands for Dialogue for Reverse Engineering Assessment and Methods, has led the way in innovating crowdsourced collaborations between academia, industry and not-for-profits to answer important biomedical questions that have a significant impact on human health. The challenges are driven by open, reproducible science and are facilitated by the Synapse platform, powered by Sage Bionetworks. It is this combination that has enabled DREAM Challenge participants (or DREAMers!) to work together collaboratively and form new communities to produce meaningful solutions.

Many aspects of the DREAM Challenges are closely aligned with our own publishing platform, so it was very natural for us to come together and discuss how we could complement one another. The result is a dedicated DREAM Challenges channel, which will enable all the DREAMers solving a data-driven question to publish peer-reviewed method articles based on the computational models they produce.

Very recently, the Prostate Cancer DREAM Challenge came to an end. The aim was to tackle key research questions about metastatic castration-resistant prostate cancer (mCRPC), an advanced form of the disease with poor outcomes. This particular challenge produced unprecedented levels of participation, with more than 550 registrants comprising more than 60 teams. All the teams produced functional computational models, and while the challenge winners customarily publish their results, there is an abundance of solid, reproducible research that would also benefit from an avenue to be published and shared with the wider community. All the teams will therefore be encouraged to publish their efforts in the DREAM Challenges channel, allowing them to receive credit for their contributions.

For a little more insight into DREAM Challenges and our new collaboration, I recently spoke with Jim Costello (University of Colorado Denver), Julio Saez-Rodriguez (RWTH Aachen University) and Gustavo Stolovitzky of IBM Research (who we shall collectively refer to as the DREAM TEAM!) about how and why we are working with each other.

Blog Interview:

F1000Research: DREAM Challenges have really harnessed the power of crowdsourcing viable solutions to important biomedical problems. Can you explain a little more what a DREAM Challenge is?

DREAM TEAM: A DREAM Challenge is a joint effort from many people where we try to shed light on fundamental questions in systems biology and translational medicine. It is set up as a collaborative competition, whereby the DREAM organizers, together with specific data providers, pose questions and provide the data necessary to address them, such as what is the best drug to treat a tumour, or what is the regulatory network of a certain cell type. We then encourage anyone from around the world to participate in solving the Challenges and provide an unbiased, rigorous assessment of each team’s solution. Following the open Challenge period, we work together to characterize participants’ solutions to learn which methods did well, which did not, and whether there are any innovative solutions to the proposed questions. Hence, a DREAM Challenge is designed and run by a community of researchers from a variety of organizations, fostering collaboration and building communities in the process.

F1000Research: With lots of “DREAMers” entering a specific challenge, how do you manage all the communication between the teams and store all the data and code that is being generated?

DREAM TEAM: We are very lucky to have partnered with Sage Bionetworks, which has developed the Synapse platform that allows us to manage the entire Challenge process. Synapse can be used to host the data while ensuring provenance, to interact with participants via forums, to run leaderboards where we follow the progress of the Challenge, and much more.
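To give a concrete feel for how DREAMers typically interact with Synapse programmatically, here is a minimal sketch using the open-source synapseclient Python package; the Synapse IDs and file names below are hypothetical placeholders rather than those of any actual Challenge.

```python
import synapseclient

# Connect to Synapse (credentials can be read from a local ~/.synapseConfig file)
syn = synapseclient.Synapse()
syn.login()

# Download a Challenge dataset by its (hypothetical) Synapse ID
training_data = syn.get("syn0000001")
print("Downloaded to:", training_data.path)

# Upload a team's predictions file to a (hypothetical) Challenge project
predictions = synapseclient.File("predictions.csv", parent="syn0000002")
predictions = syn.store(predictions)
print("Stored as:", predictions.id)
```

In practice, leaderboard submissions go through Synapse evaluation queues and each Challenge's own instructions, so the sketch above is only meant to show the general workflow of pulling data down and pushing results back up.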

F1000Research: Open, transparent data sharing is at the core of DREAM Challenges, something that is perfectly aligned with our ethos here at F1000Research. Why do you think this is so important, not just for the challenges but for science itself?

DREAM TEAM: Transparency and openness are requirements for reproducibility, which is at the core of the scientific method itself. Typically, science is not a series of Eureka moments where some brilliant researcher develops a new theory from scratch; rather, developments in science most often build upon previous findings, insights, and data. As for the need to share data, the idea is pretty straightforward: when it comes to health and disease, hoarding data means missing opportunities for that data to help patients. We fully believe that advances in addressing the questions posed in DREAM Challenges require the open sharing of data, methods, and insights, so that the community is in a position to take the results from a DREAM Challenge and advance the field even further. The principles of transparency and openness have been core to DREAM since the beginning, and are also at the heart of what Sage Bionetworks is about.

F1000Research: We have recently launched the DREAM Challenges Channel on our platform to complement all the great work that is undertaken by participating DREAMers. Can you tell us why you wanted to collaborate with us to provide all the DREAMers with a route to publication?

DREAM TEAM: Over the course of a DREAM Challenge, hundreds of researchers from around the world work to solve a few focused sets of questions. This is a massive effort representing thousands of person-hours of work. Teams try different approaches, often including innovative ideas, and this is an invaluable resource to the community. But the details of individual teams’ methods are, in the best case, relegated to the supplementary materials of the overview paper describing the Challenge, or, in the worst case, never published. The DREAM Challenges Channel on F1000Research provides a fantastic framework to report all the ideas and results that participants developed over the course of a Challenge. The ability to submit updated versions is also an excellent feature, as participants often expand or refine their work after the Challenge has completed.

F1000Research: On the subject of winning a DREAM Challenge, there must be some sort of criteria that determine which team deserves the accolade of DREAM Challenge winner. You have recently published an article in the DREAM Challenges channel, called DREAMTools, which describes the scoring metrics you use to determine this; care to discuss how this works a little further?

DREAM TEAM: An essential step in a DREAM Challenge, or any other collaborative competition, is to assess the performance of the different methods. For this we require a gold standard set of data that is hidden from all participants, unpublished, and represents the best currently available data against which to score individual approaches. The criterion is loosely defined as how close a participant’s prediction is to the gold standard, although there are many subtleties about how this is computed, since the confidence in the different measured values is often not the same. Over the years, we have developed many scoring methods for DREAM Challenges. The corresponding scoring code is typically designed on a per-Challenge basis, in an arbitrary language and syntax. In addition, templates and gold standards need to be retrieved manually. To facilitate the use of the Challenge resources by the scientific community, we have gathered the DREAM scoring functions into a single software package, DREAMTools, that provides a single entry point to them.
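As a purely illustrative sketch of that idea (not the DREAMTools implementation itself), scoring a submission often amounts to comparing a team’s predictions against the held-out gold standard with an appropriate metric, for example a rank correlation; all numbers below are made up.

```python
import numpy as np
from scipy.stats import spearmanr

def score_submission(predictions, gold_standard):
    """Toy scoring function: Spearman rank correlation between the
    predicted and observed values (higher is better)."""
    rho, _ = spearmanr(predictions, gold_standard)
    return rho

# Hypothetical held-out gold standard (e.g. observed survival times)
gold = np.array([12.0, 3.5, 8.1, 20.2, 5.0])

team_a = np.array([10.0, 4.0, 9.0, 18.0, 6.0])   # same ordering as the gold standard
team_b = np.array([5.0, 12.0, 3.0, 8.0, 20.0])   # orders the samples poorly

print(score_submission(team_a, gold))   # 1.0 (identical ranking)
print(score_submission(team_b, gold))   # negative (poor ranking)
```

Real Challenge metrics are considerably more involved, for example weighting by measurement confidence or combining several sub-scores, which is exactly the kind of detail DREAMTools packages up behind a single entry point.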

F1000Research: So we are on the tenth installment of the DREAM Challenges (DREAM 10), which will focus on predicting the progression and survival of ALS patients – would you like to elaborate on some of the previous challenges and what you achieved?

DREAM TEAM: DREAM started with a focus on the question of reverse engineering gene networks, hence the acronym DREAM, which stands for Dialogue for Reverse Engineering Assessment and Methods. The first 5-6 years of DREAM led to many important advances in network inference and the evaluation of the different methods (Marbach et al., Nature Methods, 2012). Essentially, we showed through the DREAM Challenges that the most effective strategy for answering a question is to aggregate the predictions of different methods, and we think we have been pioneers in demonstrating that for the inference of gene regulatory networks. We have since expanded our questions into other areas, and we have been able to help improve the state of the art in disparate fields such as modeling transcription factor sequence specificity (Weirauch et al., Nature Biotechnology, 2013), prediction of the outcome of ALS patients (Kueffner et al., Nature Biotechnology, 2015), and mutation calling methods (Boutros et al., Nature Methods, 2015). We also work with a broad range of partners, from international consortia such as the International Cancer Genome Consortium (mutation calling) and Project Data Sphere (prediction of the outcome of prostate cancer patients), to the pharmaceutical company AstraZeneca (predicting which drug combinations are effective on which cancer cells).
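As a toy illustration of that aggregation idea (a minimal sketch, not any particular DREAM pipeline), one simple approach is to convert each method’s scores into ranks and average the ranks across methods, so that no single method’s scale dominates; the scores below are invented.

```python
import numpy as np
from scipy.stats import rankdata

def aggregate_by_rank(score_matrix):
    """Rank-average aggregation: turn each method's scores (one row per
    method) into ranks, then average the ranks across methods."""
    ranks = np.apply_along_axis(rankdata, 1, score_matrix)
    return ranks.mean(axis=0)

# Hypothetical: three methods scoring five candidate regulatory interactions
scores = np.array([
    [0.9, 0.1, 0.4, 0.7, 0.2],
    [0.8, 0.3, 0.5, 0.6, 0.1],
    [0.7, 0.2, 0.6, 0.9, 0.3],
])

consensus = aggregate_by_rank(scores)
print(consensus)  # higher average rank = stronger consensus across methods
```

Consensus predictions built in this “wisdom of crowds” spirit tend to be more robust than any individual method, which is one of the recurring findings of the network inference Challenges.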

Besides the advances in the methods developed to address the challenges themselves, there are two other important contributions that DREAM Challenges make to the community. First, the data needed to address the Challenges is often hard to find, messy, and not unified; DREAM Challenges provide a valuable resource to the community by annotating, cleaning, and making such data publicly available. Second, DREAM Challenges provide an unbiased benchmark of the methods in the field. Knowing how the community is performing on an important question is essential to knowing what needs to be done in the future, whether that is improved method development or identifying the type of data that is lacking to address the Challenge.
