Once upon a time – guest blog by Sue Malcolm

Making data widely available doesn’t always mean that it can be widely understood. In this guest blog post, Sue Malcolm considers the vast information we have about different species’ genomes, compared to how very few people know how to interpret this information. Malcolm is Faculty Member for F1000Prime, and Emeritus Professor of Molecular Genetics at the Institute of Child Health, University College London, where she writes the blog Me & My Genes. Her guest post below is food for thought when considering what lies ahead for open science.

 

Tracing your family history is a very popular pastime, but even as recently as ten years ago I don’t think most of us would have guessed that by 2014 we would be tracing our ancestry back to Neanderthal man. Two breakthroughs have allowed this. Firstly, next-generation sequencing has made it almost routine to sequence and analyse an entire human genome and, secondly, great advances have been made in manipulating very ancient DNA. The group led by Svante Pääbo in Leipzig has been largely responsible for the latter.

As a result of all these efforts, we now know that Neanderthal man interbred with our ancestors, mainly donating genes for fairer skin and hair, which helped our ancestors adapt to colder, more northern climates as they migrated out of Africa. We know that 7,000 years ago a group of dark-skinned, blue-eyed, lactose-intolerant humans lived in caves in north-western Spain. We can reveal historical events from centuries ago and chronicle the rise and fall of empires, invasions, migrations and the slave trade across Europe and Central Asia, and much more.

[Image: Yak]

Once the human genome was completed, these techniques were applied to more, and sometimes obscure, species. Chinese scientists sequenced the yak and the panda genomes (to find out how yaks adjusted to high altitudes and how pandas digest bamboo); the Malaysian government funded sequencing of the oil palm (so that better ways of selecting the most productive palms could be established, avoiding unnecessary felling of the dwindling rain forest); environmentalists sequenced the weird and rare Madagascan lemur, the aye-aye (see image below), to develop a conservation plan; and the Natural History Museum sequenced the tapeworm because, well, tapeworms are very nasty parasites and knowing more about them may help treat infections.

[Image: Aye-aye]

All these studies are very data heavy (a human genome has 3 billion base pairs, and producing a genome often involves up to 60-fold coverage) and they generate papers with vast supplementary sections, available online. The supplementary material can easily run to 100 pages. Custom-built software is needed for the analysis. For example, the program used to analyse historic population movements was called Globetrotter. As a result, the papers are of enormous general interest but almost impossible for anyone other than a very small group of peers to critique. This goes for F1000 members too, when they recommend articles about these studies on F1000Prime.

Even though most genomic sequences are freely available via online databases, interpretation of this vast amount of data is still limited to specialists in the field. It is a strange contrast that these deep secrets of the key events that have led to the development of modern man should be so very difficult for most of us to understand at the data level.

Image credits: Yak by Andrea Williams; Aye-aye by Frank Vassen.
