Random gene sets can predict breast cancer survival better than cancer-related signatures

And top-tier journals did not want to publish this surprising study.

Tumours are bundles of cells that grow and divide uncontrollably, and their genes are deployed in unusual ways. By analysing the genes from different tumour samples, scientists have tried to pin down the chaotic events that lead to cancer. They seem to be making headway. Dozens of papers have reported “gene expression signatures” that predict a patient’s chances of surviving cancer, and new ones come out every month.

These signatures purportedly hint at how healthy cells transform into tumours in the first place. If, for example, the genes in question are involved in wound healing, this supposedly tells you that the healing process is somehow involved in a tumour’s progression. The assumption is that these collections of genes reveal deeper truths about the disease they’re associated with.

This idea sounds reasonable, but David Venet from the Université Libre de Bruxelles has thrown a big spanner into the works. He has shown that completely random sets of genes can predict the odds of surviving breast cancer better than published signatures.

Venet found three signatures that are completely unconnected to cancer. These collections of genes were associated with laughing at jokes after lunch, with the experience of social defeat in mice, and with the positioning of skin cells. Yet all three were associated with breast cancer outcomes.

It got worse. Venet collected 47 breast cancer signatures from published papers and compared them to sets of random genes. The random sets were as strongly associated with breast cancer outcomes as 60% of the published signatures, or more so. In fact, if you randomly select a group of 100 genes or more, you can be 90% sure of finding a statistically significant link with breast cancer. Venet wrote, “Investigators are bound to find an association however whimsical their marker is.”

Tubular Adenoma of Breast. Image from Flickr, by Ed Uthman

Venet’s study was described as a “must-read” by F1000 member Jinfeng Liu from Genentech Inc. The results may seem unbelievable, but there is a simple reason for them. The activities of thousands of genes across a breast cancer cell’s genome are related to how quickly that cell proliferates (grows and divides). And that is related to a patient’s prognosis.
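To see why a shared driver makes random gene sets look predictive, here is a minimal simulation sketch in Python. It is not the study's method: the study worked with real tumour data and survival outcomes, whereas this toy uses made-up numbers, a continuous "risk" score in place of survival times, and a plain correlation test. The function name `fraction_significant`, the gene loadings, and all parameter values are assumptions chosen for illustration only.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def fraction_significant(n_patients=200, n_genes=1000, frac_prolif=0.5,
                         set_size=100, n_sets=100, seed=0):
    """Fraction of random gene sets whose mean expression correlates
    with a simulated risk score (toy model, hypothetical parameters)."""
    rng = random.Random(seed)

    # Hidden driver: each patient's tumour proliferation rate.
    prolif = [rng.gauss(0, 1) for _ in range(n_patients)]

    # Outcome: risk depends on proliferation plus unrelated noise.
    risk = [p + rng.gauss(0, 1) for p in prolif]

    # A fraction of genes track proliferation; the rest are pure noise.
    loadings = [rng.uniform(0.2, 0.8) if rng.random() < frac_prolif else 0.0
                for _ in range(n_genes)]
    expr = [[w * p + rng.gauss(0, 1) for p in prolif] for w in loadings]

    # Approximate two-sided p < 0.05 cutoff for |r| with 200 patients.
    threshold = 0.139

    hits = 0
    for _ in range(n_sets):
        genes = rng.sample(range(n_genes), set_size)
        # Signature score: mean expression of the chosen genes per patient.
        score = [statistics.fmean(expr[g][i] for g in genes)
                 for i in range(n_patients)]
        if abs(pearson(score, risk)) > threshold:
            hits += 1
    return hits / n_sets


if __name__ == "__main__":
    print("genes tied to proliferation:", fraction_significant(frac_prolif=0.5))
    print("no shared driver:           ", fraction_significant(frac_prolif=0.0))
```

In this toy model, when many genes track proliferation, almost any random set of 100 genes picks up enough of that signal to correlate with the outcome. When no genes track it, random sets pass the significance cutoff only about as often as chance allows.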

As an analogy, you could find hundreds of things that correlate with a person’s wellbeing and lifespan: the number of Apple products they own, whether they have university degrees, how many cars they have, and so on. But this doesn’t mean that these things improve their health; instead, they reflect how wealthy the person is, their lifestyle choices, and their access to good healthcare.

Gene signatures may be relatively useless at illuminating the causes of cancer, but the team stresses that they can still help doctors – after all, they’re still related to prognosis. Writing in The Scientist, the study’s lead author Vince Detours says, “Smoke does not drive fire, yet it is a powerful indicator of when and where a fire is burning.”

Detours also aims a blow at scientific publishers who have let studies of genetic signatures proliferate uncontrollably. He wrote:

“It took us four years and six rejections to get this work finally published in a computational biology journal – not the most efficient venue to reach the oncology community. Meanwhile, a steady stream of studies confounded by proliferation rates has appeared.”

He added,

“This has to be said; one can no longer stay silent about the rather limited self-correction capability of the top tier publishing system (Cell, Nature Genetics, PNAS, etc.), which promoted these studies in the first place.”

Filed under Genomics & Genetics, Oncology, Pharmacology & Drug Discovery.


  1. DaveR says:

    As a doctoral student in a genetics program that is highly reliant on funding that results from GWA studies, I would have thought that this type of article would make a bigger splash than it did. The fact that it was more or less ignored reminds me of Thomas Kuhn, who so clearly saw that the path to scientific truth is often blocked by entrenched dogmas which are rooted in nothing more substantial than the personal reputations (and money) of the established scientific community.

  2. Ben says:

    Makes sense. When a patient gets cancer, it’s highly possible that the whole system is messed up.

  3. As a senior Professor of Genetics, I fully agree with DaveR’s comment. I am trying to get out of the bunker of orthodoxy in genetics (and science in general), to remove the veil and get a more honest picture of what is actually going on. There are too many egos, and too many economic and power interests, to topple the icons of an arrogant science.

  4. Nancy says:

    A 2010 paper reported that many randomly selected gene sets (let’s say 30 genes) showed predictive power for cancer prognosis. However, the authors found that most of these “randomly selected markers” are not robust, meaning that they have no predictive power in other datasets (cohorts). Finally, they focused on ‘cancer hallmark genes’ and obtained markers which showed robustness and high predictive accuracy (90%).

    2010, Identification of high-quality cancer prognostic markers and metastasis network modules.