
Monday, March 03, 2014

Death of the genome paper

There was a time when sequencing a gene was just about all you needed to get a publication. Getting a high quality sequence of a typical protein-encoding gene (cDNA) took several years of work—almost sufficient for a Ph.D. thesis.

By the 1990s, that was routine and you needed much more to get a paper published. The genome era had begun and a good paper in a high impact journal required the complete sequence of an entire genome.

Today, you can't get a genome sequence published because it's so easy that undergraduates can do it.

David Smith of Western University (London, Ontario, Canada) laments the death of the genome paper while recognizing that sequencing has probably been abused (Smith, 2013). He makes some good points ...
One of the drawbacks of genome papers, however, is that they can create a mindset of sequence first, ask questions later. I once attended a Masters thesis defense where the external examiner asked the candidate why he sequenced the chloroplast genome of this particular species and what hypothesis was he trying to test. The student, looking startled, answered, "Because the genome hadn't been sequenced before and we didn't know what it looked like." After the defense, I overheard the examiner in the hallway venting to another professor. "We've created a culture of serial genomicists," she exclaimed. "Everyone's jumping from one genome sequence to the next, looking to score a major publication."

Regardless of this opinion, genome papers have provided much of the raw data that have shaped our view of genetics and evolution over the past 20 years. And they can also be a joy to read. Many of my favorite journal articles are genome papers. I remember, when I was a grad student in phycology, eagerly awaiting publication of the genome for Chlamydomonas—the superstar of green algae—and reading it incessantly once it was released, gleaning new insights each time through. There is something intimate and personal in learning about a species' genome. And similarly, if you are part of the team describing the genome, there is a feeling that you're giving the readers a first glimpse at an uncharted territory, with its unique landscape of genes, introns and intergenic regions.

But all of this may be coming to an end. Next generation DNA sequencing techniques have made it easy, fast and cheap to sequence genomes. Today, just about any scientist can walk out their laboratory doors, point to a living thing and say, “I will sequence you!” High-throughput technologies have flooded the academic market with genome papers. And the top journals have responded by only accepting papers describing the most novel, earth-shattering genomes. The less spectacular genomes, much like B-movies, go directly to video, or rather directly to GenBank. This sequencing-vs-publishing arms race has been going on for a long time.


Is it time to write the genome paper obituary? Maybe not quite yet. Every now and then they still claw their way into top journals. But the end is not far off, and when it does come, I'm sure that I speak for all of us genome geeks when I say, "Farewell, GP. It was fun while it lasted."
I still like to read genome papers but lately I've been put off by the lack of reliable information in most of those papers. One of the things I'm interested in is the number of genes, especially the number of unique genes. Unfortunately, the annotation usually relies on computer-generated gene predictions and those are notoriously unreliable.

Smith, D.R. (2013) Death of the genome paper. Frontiers in Genetics 4:1-2. [doi: 10.3389/fgene.2013.00072]


Jonathan Badger said...

Unfortunately, the annotation usually relies on computer-generated gene predictions and those are notoriously unreliable.

A large part of this is that the multi-million dollar genome grants that were common a decade ago just don't exist any more. These days a genome is likely to be funded as part of a larger project to compare multiple strains of a given species at a small fraction of the funding a single genome used to get. Yes, sequencing has become much cheaper. But labor and lab work haven't. The teams of manual gene annotators and lab techs testing computational results that used to cost a small fraction of a genome grant now would cost more than the entire grant. So these days genomes tend to be annotated entirely computationally with only brief human analysis.

Joe Felsenstein said...

One problem with genome papers was that there was pressure to put in conclusions drawn from an initial look. In the human genome paper, a set of conclusions based on hasty analysis were given. I'm told that every one of them ultimately proved to be wrong.

There is also a silly terminology surrounding them. People in my department, which is a Genome Sciences Department, the best in the world (and also the worst in the world) like to say "in 2004 we published the sequence of [name of organism]". But you can look at that paper all you want, and will never see the genome sequence there.

Anonymous said...

Joe F problem with genome papers was "that there was pressure to put in conclusions drawn from an initial look. In the human genome paper, a set of conclusions based on hasty analysis were given. I'm told that every one of them ultimately proved to be wrong..... " Was one right?
What are you Joe? What else are you going to sell for this ...?

Jonathan Badger said...

Yeah, at least Margaret Dayhoff literally published her "atlas" of proteins back in the day -- as actual dead-tree books containing the sequences themselves! To be fair, that's because most biologists didn't have computers -- she actually kept it on computer punch cards herself and actually invented the one-letter amino acid abbreviations to do so efficiently.

Joe Felsenstein said...

Good to remember Margaret Dayhoff's important work. I have two of those Atlases, that I bought way back then. I did also meet her once in about 1981 -- she seemed to be a very good person. Her former coworkers seem to have been very fond of her and eager to remember her.

The importance of her work has increasingly been recognized -- one of the two pioneers of the sequence databases (along with Walter Goad, who founded GenBank), a major pioneer of the recognition of gene families, originator of the one-letter amino acid code, and, with Richard Eck, first to infer a phylogeny from sequences (protein sequences, by parsimony, in 1966).

That's some list of achievements.

Joe Felsenstein said...

... not to forget her formulation of the PAM model, the first 20x20 amino acid substitution model.

It is sad that she died in 1983 before computational molecular biology really got recognition, and so she didn't get to see the belated recognition she is now getting. The Biophysical Society has a Margaret Oakley Dayhoff Award in her honor. There should be more.

Manoj Samanta said...

An old post -

Good Riddance – “Death of the Genome Paper”

Tom Mueller said...

I dunno – I think I found this genome sequence Science Article mind-boggling!

“If ctenophores evolved before sponges, the sponges probably lost some of their ancestors’ complexity. It’s also possible that sponges have a complexity that has yet to be defined.”

Wow – if this becomes consensus, Biology textbooks will require a radical rewrite!

This is my take –

Some taxonomists had already argued that Cnidarians are descendants of ancient bilateral coelomates and not the other way around. Biologists have known since the 1920s that Cnidaria had a directive axis which gave them right and left-hand sides. Volker Schmidt went on to argue that non-radially organized hydrozoan larvae have an anterior concentration of sensory and ganglionic nerve elements, suggesting that a fundamental genetic toolkit for the establishment of bilateral and polarized anatomies was already present before the Cnidaria-Bilateria divergence. He went so far as to suggest that the diploblastic status of adult Cnidarians is derived and that true mesoderm can even be detected during Cnidarian embryogenesis. That last argument has proven particularly contentious.

I marvel meanwhile at the outlandish notion that each and every generation some cephalized and segmented deuterostome bilaterian repeatedly generates yet another radial non-segmented animal possessing a nerve net that appears to lack any prima facie cephalization: a metazoan alternation of generations, as it were. This, of course, describes the life-cycle of an echinoderm.

I also marvel that both larva and adult possess the identical complement of regulatory genes! Of course, evolution of brand new species can occur when such cycles are fixed in either the juvenile or the adult mode. And, clearly, subtle processes contingent on regulatory gene timing can direct drastic changes in body plan.
Granted, closer examination of the echinoderm adult would call into question overly facile textbook generalizations of body plan. Should not the ganglia comprising a central nerve ring of a Sea Cucumber be considered the ANTERIOR focal point of a central nervous system no differently than the more rudimentary ring of ganglia that comprises the “brain” of an Annelid? Platyhelminthes also appear to possess a similar repetitive body plan (in what appears to be a segmental array), albeit along two axes not five as in echinoderms… but no textbook (that I am aware of) interprets these observations along the terms I just described.

As taxonomists pursue this debate, it appears to me that standard textbooks, based on out-of-date curricula, are clearly clinging to outdated generalizations. How is a High School Biology teacher to make sense of such a tangled mess?

My next question: after the dust settles, would it be possible to interpret sequencing data along the lines that Deuterostomes are similarly basal to Lophotrochozoans and Ecdysozoans as Ctenophores are to Porifera and Eumetazoa: Lophotrochozoans and Ecdysozoans being “more derived” in cladistic terms than Deuterostomes? I mean, is it conceivable that the last common ancestor of Deuterostomes, Lophotrochozoans and Ecdysozoans had a dorsal nerve cord, for example?

Joe Felsenstein said...

My next question ... would it be possible to interpret ... dorsal nerve cord, for example?

Yes, it would be possible to interpret that, and also possible to interpret it as the opposite, that the common ancestor of D, L, and E had a ventral nerve cord.

The reason is that there are quite a few biologists who would agree that D are "basal to" L and E, in that the tree is (D,(L,E)). Since D have dorsal nerve cords, and L and E share ventral nerve cords (and all the outgroups have no nerve cord), then we are left with an ambiguity: the ancestral state would be a nerve cord, but it might be either dorsal or ventral.

So yes, it is "possible to interpret" it that way ... or the opposite.
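
The ambiguity described above can be reproduced with a small parsimony calculation. What follows is only a minimal sketch, not anything from the comment itself: it assumes the tree (D,(L,E)), takes the presence of a nerve cord in the common ancestor as given (as the comment argues), scores only the position of the cord, and leaves out the outgroups because position is inapplicable to animals with no nerve cord. The function name `fitch` and the dictionary `position` are made up for illustration.

```python
def fitch(tree, states):
    """Fitch parsimony down-pass on a rooted binary tree.

    tree is either a leaf name (str) or a (left, right) tuple.
    Returns (set of equally parsimonious states at this node,
             minimum number of changes below this node).
    """
    if isinstance(tree, str):
        return frozenset([states[tree]]), 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # The children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Assumed scoring: D = deuterostomes, L = lophotrochozoans, E = ecdysozoans.
tree = ("D", ("L", "E"))
position = {"D": "dorsal", "L": "ventral", "E": "ventral"}

ancestral, changes = fitch(tree, position)
print(sorted(ancestral), changes)  # ['dorsal', 'ventral'] 1
```

Under these assumptions both states are equally parsimonious at the root of the three groups, each requiring a single change, which is the "either dorsal or ventral" outcome.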

Tom Mueller said...

@ Joe

Thank you!!!

In other words the specious suggestion that Protostomes are basal to Deuterostomes is yet another red herring.

In a similar vein regarding "amphistome", "deuterostome" and “protostome” - it is NOT clear which is basal (or whether these terms in fact have meaning). Many annelids (presumed Protostomes) show classical protostome development where the blastopore becomes the mouth, whereas other annelids’ blastopores become the anus (according to deuterostome dogma) while other annelids behave like so-called amphistomes where the blastopore becomes both the mouth and anus, which could render moot the entire question of which version is basal.

It is not at all clear to me whether these terms describing "blastopore/mouth/anus destinies" in fact have meaning, which would oblige us to resort instead to evo-devo explanations invoking so-called molecular toolkits.

Not to mention the canard of radial vs. spiral cleavage... Brachiopods, although presumed Protostomes - in fact have radial cleavage not spiral.

One interpretation - radial cleavage is basal (whereas spiral is derived) along the same lies you describe above, leaving radial-cleavage-Brachiopods an earlier atavistic side branch of Lophotrochozoa

Joe Felsenstein said...

"... along the same lies you describe above ..."

I guess I'm a terrible person.

Tom Mueller said...

@ Joe

OUCH!!!! mea culpa mea culpa mea maxima culpa!!!!!

Of course I meant to say "... along the same LINES you describe above ..."

My sincerest apologies!!! I wish there was an edit feature to correct typos after posting.

I notice that mregnor has just made an appearance

I wonder at his reaction to the suggestion that humankind does not represent the apex of evolution but rather the scala naturae culminates with cephalopods and arthropods who are better candidates by far than ourselves to represent the creator's image.

Joe Felsenstein said...

No offense. I knew it was a typo, a funny one.