Sandwalk: The Second Grapevine Genome Is Published

Friday, December 28, 2007

The Second Grapevine Genome Is Published

A second version of the grapevine genome was published in PLoS ONE last week (Velasco et al. 2007). As I began to collect information on that paper I learned that another genome sequence of grapevine had been published independently last September in Nature (Jaillon et al. 2007). Before discussing the PLoS ONE paper I decided to write up a report of that September genome sequence, trying not to let the second sequence influence me [The Grapevine Genome].

This gives us an opportunity to evaluate the state of genome biology and genome evolution by comparing two competing analyses of the same genome. Keep in mind that the authors of the second paper were aware of the first study when they published in PLoS ONE so they had an opportunity to correct or modify their own work in light of the previous paper. Thus, the second group is able to point out "errors" in the first sequence and correct "errors" in their own sequence before publication.

Keep this in mind as you read the second paper because it often seems as though the first group to publish did a very sloppy job. What we don't see in the published work is the evidence of sloppiness in the second study that was fixed by referring to the earlier work.

Velasco et al. (2007) also sequenced the Pinot Noir cultivar of Vitis vinifera but unlike the previous study they used a heterogeneous strain. Recall that in the September paper the sequencing team used an inbred line in order to reduce the extreme heterogeneity seen in normal wine-making strains.

The genome size is 505 Mb (505 × 10⁶ bp). This is larger than the earlier published sequence (487 Mb). The extra DNA is almost entirely due to inclusion of ribosomal RNA clusters. Velasco et al. (2007) identified 29,585 genes—only slightly fewer than the 30,434 genes reported by Jaillon et al. (2007). Both teams used fairly strict criteria for identifying and annotating genes. The number of genes in the grapevine genome is comparable to the number in Arabidopsis (26,819) but fewer than the number in poplar (45,555) and rice (41,046). We can expect this number to fall as false positives are eliminated.

There are 719 tRNA genes (including 163 pseudogenes), 89 snRNA genes, and about 1500 copies of the 18S + 5.8S + 28S ribosomal RNA repeat. There are about 175 copies of the 5S RNA gene.

The authors report 166 snoRNA genes and 143 microRNA genes, identified by similarity to known examples in other plant genomes.

Many plants exhibit very high heterogeneity between homologous chromosomes. Homologous chromosomes in the Pinot Noir cultivar differ by as much as 11% in DNA sequence, including large gaps. This gives rise to regions that are hemizygous: they contain only one copy of a DNA sequence in a diploid genome. An example of this heterogeneity is shown below.

Two almost contiguous regions of chromosome 1 are depicted. The red regions are transposons of various kinds (c=Copia, a=Gypsy/athila, etc.). You can see that many of the deletions/insertions are at transposon positions, indicating that much of the heterogeneity between homologous chromosomes is due to the insertion and excision of active transposons. This level of transposon activity is rare in mammalian genomes but common in flowering plants.

In order to study the evolution of the grapevine genome, Velasco et al. (2007) compared the sequences of paralogous genes. These are genes that belong to a gene family that diverged from a common ancestor. By comparing the differences in sequence between any two genes it is possible to estimate the time of divergence. In order to avoid any bias due to selection, it is preferable to only compare nucleotide substitutions that do not change the amino acid sequence (synonymous substitutions, K_s).
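To make the idea of "synonymous" differences concrete, here is a minimal Python sketch. It is not the method used by Velasco et al. (2007): a real K_s estimate normalizes by the number of synonymous sites and corrects for multiple substitutions at the same site, while this toy version only classifies differing codons in two aligned coding sequences. The function names, the example sequences, and the `rate_per_site_per_year` parameter are all assumptions made for illustration, and no substitution rate from either paper is implied.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_differences(seq_a, seq_b):
    """Count differing codons as synonymous or non-synonymous.

    Both inputs must be aligned, gap-free, in-frame coding sequences.
    Returns a (synonymous, nonsynonymous) tuple of codon counts.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must have equal length, a multiple of 3")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1      # same amino acid, different codon
        else:
            nonsynonymous += 1   # amino acid changed
    return synonymous, nonsynonymous


def divergence_time(ks, rate_per_site_per_year):
    """Simple molecular-clock estimate (hypothetical helper).

    Both copies accumulate substitutions after the duplication,
    hence the factor of 2 in the denominator.
    """
    return ks / (2 * rate_per_site_per_year)


# Made-up sequences: Met-Ala-Leu versus Met-Ala-Phe.
print(classify_codon_differences("ATGGCTTTA", "ATGGCCTTC"))  # (1, 1)
```

In the toy pair, GCT and GCC both encode alanine, so that difference is synonymous, while TTA (leucine) to TTC (phenylalanine) is not. Only differences of the first kind feed into a K_s-style estimate.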

The results are shown in the figure above. Most of the pairs of genes are very similar, with 0 or 0.1 substitutions per synonymous site. These genes arose from a very recent duplication event. There is a secondary peak at about 0.9 substitutions, indicating that a large number of genes were duplicated at some particular time in the past. If this is evidence of a genome-wide duplication event then these pairs of genes should be clustered in syntenic regions (large segments of the chromosome that have the same order of genes).

The inset (E) shows the distribution of those pairs from syntenic regions. It looks like most of the pairs have accumulated similar numbers of substitutions, suggesting strongly that there was a genome-wide duplication event.
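The reasoning about peaks in the K_s distribution can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the K_s values below are invented toy numbers, not data from the paper, and the function names, the 0.1 bin width, and the `min_count` threshold are arbitrary choices of mine. A real analysis would restrict the input to pairs from syntenic blocks and would likely fit the distribution rather than just look for local maxima.

```python
from collections import Counter


def ks_histogram(ks_values, bin_width=0.1):
    """Count gene pairs per bin; bin k is centred on k * bin_width."""
    return Counter(round(ks / bin_width) for ks in ks_values)


def find_peaks(histogram, bin_width=0.1, min_count=3):
    """Return centres of bins strictly taller than both neighbours.

    Bins with fewer than min_count pairs are ignored as noise.
    """
    peaks = []
    for index in sorted(histogram):
        count = histogram[index]
        if count < min_count:
            continue
        left = histogram.get(index - 1, 0)
        right = histogram.get(index + 1, 0)
        if count > left and count > right:
            peaks.append(round(index * bin_width, 6))
    return peaks


# Invented K_s values: many recent duplicates plus a cluster near 0.9.
toy_ks = [0.0] * 5 + [0.1] * 4 + [0.4] + [0.8] * 2 + [0.9] * 3 + [1.0] * 2 + [1.5]

print(find_peaks(ks_histogram(toy_ks)))  # [0.0, 0.9]
```

The first peak corresponds to very recent duplicates and the second to a burst of duplication at one point in the past. Whether such a burst reflects a genome-wide event is exactly what the synteny check is meant to test.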

It is well known that flowering plant genomes have undergone polyploidization and/or hybridization during their evolution from a common ancestor about 200-300 million years ago. In their September paper in Nature, Jaillon et al. (2007) proposed that the grapevine genome was closer to the common ancestor of dicotyledonous plants. Their analysis suggested that all dicots arose from a hexaploid ancestor (three haploid genome equivalents). Further duplications occurred in the lineages leading to poplar and Arabidopsis, according to Jaillon et al. (2007) [The Grapevine Genome].

Velasco et al. (2007) disagree. In the second genome study they claim that the ancestral dicot genome was tetraploid (one round of duplication) and that a second round of duplication (2R) occurred in the grapevine lineage after it diverged from poplar and Arabidopsis (see below). Note that in this study Arabidopsis and poplar are assumed to be more closely related to each other than they are to grapevine, whereas in the previous study grapevine was clustered with poplar.

A third duplication (3R) took place independently in the lineages leading to Arabidopsis and poplar, according to Velasco et al. (2007).

At present, it isn't possible to say who is correct. In fact, they might both be wrong. The significance of these two studies is that they give us some idea of how much confidence we can place in speculations about genome evolution. How you interpret your data depends very much on how you compare sequences both within a species and between species. The data do not seem to be good enough to make confident predictions, judging by the differing opinions of these two groups.

The take-home lesson is that we need to take studies of this sort with a large grain of salt. In most cases we won't be lucky enough to have competing labs to analyze the same data and point out differing interpretations.

Jaillon, O., Aury, J.M., Noel, B., Policriti, A., Clepet, C., Casagrande, A., Choisne, N., Aubourg, S., Vitulo, N., Jubin, C., Vezzi, A., Legeai, F., Hugueney, P., Dasilva, C., Horner, D., Mica, E., Jublot, D., Poulain, J., Bruyère, C., Billault, A., Segurens, B., Gouyvenoux, M., Ugarte, E., Cattonaro, F., Anthouard, V., Vico, V., Del Fabbro, C., Alaux, M., Di Gaspero, G., Dumas, V., Felice, N., Paillard, S., Juman, I., Moroldo, M., Scalabrin, S., Canaguier, A., Le Clainche, I., Malacrida, G., Durand, E., Pesole, G., Laucou, V., Chatelet, P., Merdinoglu, D., Delledonne, M., Pezzotti, M., Lecharny, A., Scarpelli, C., Artiguenave, F., Pè, M.E., Valle, G., Morgante, M., Caboche, M., Adam-Blondon, A.F., Weissenbach, J., Quétier, F., Wincker, P.; French-Italian Public Consortium for Grapevine Genome Characterization (2007) The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature 449:463-467. [PubMed] [Nature]

Velasco, R., Zharkikh, A., Troggio, M., Cartwright, D.A., Cestaro, A., Pruss, D., Pindo, M., Fitzgerald, L.M., Vezzulli, S., Reid, J., Malacarne, G., Iliev, D., Coppola, G., Wardell, B., Micheletti, D., Macalma, T., Facci, M., Mitchell, J.T., Perazzolli, M., Eldredge, G., Gatto, P., Oyzerski, R., Moretto, M., Gutin, N., Stefanini, M., Chen, Y., Segala, C., Davenport, C., Demattè, L., Mraz, A., Battilana, J., Stormo, K., Costa, F., Tao, Q., Si-Ammour, A., Harkins, T., Lackey, A., Perbost, C., Taillon, B., Stella, A., Solovyev, V., Fawcett, J.A., Sterck, L., Vandepoele, K., Grando, S.M., Toppo, S., Moser, C., Lanchbury, J., Bogden, R., Skolnick, M., Sgaramella, V., Bhatnagar, S.K., Fontana, P., Gutin, A., Van de Peer, Y., Salamini, F., Viola, R. (2007) A high quality draft consensus sequence of the genome of a heterozygous grapevine variety. PLoS ONE 2(12): e1326. doi:10.1371/journal.pone.0001326 [PubMed] [PLoS]

5 comments :

RPM said...: I mentioned in the previous comment thread that I had heard that the phylogeny used in the Nature paper was incorrect. From what I can recall, it's because the authors used blast scores between the genes from each species to determine the evolutionary relationships. This is not an appropriate phylogenetic technique.

I'm not sure what the "correct" phylogeny is, but the PLoS ONE paper may be working with a better tree. That would mean the polyploidization events inferred in this paper are more accurate.; Saturday, December 29, 2007 12:31:00 PM
Anonymous said...: Using a heterogeneous strain also facilitates shotgun assembly of repetitive sequences, so the PLoS group arguably had an easier task than the Nature group.; Saturday, December 29, 2007 12:45:00 PM
Larry Moran said...: RPM said,

I'm not sure what the "correct" phylogeny is, but the PLoS ONE paper may be working with a better tree. That would mean the polyploidization events inferred in this paper are more accurate.

Actually it makes no difference. The predictions of genome duplications in the Nature paper would look even better if they had used the phylogeny in the PLoS paper.; Saturday, December 29, 2007 2:37:00 PM
TheBrummell said...: Recall that in the September paper the sequencing team used an inbred line in order to reduce the extreme heterogeneity seen in normal wine-making strains.

This point piqued my interest. Are normal wine-making strains of grapes significantly more heterogeneous in their genes than other plants?

What is meant here by "heterogeneity"? My first thought was that this was heterozygosity, suggesting that populations of wine grapes had high heterozygosity at specific loci. However, it also occurs to me that this heterogeneity could be referring to between-individual variation, such that individuals in populations are quite different from each other in terms of alleles carried.

We've all heard of "hybrid vigour"; is high diversity (of alleles, proteins, or enzyme products like lipids) something that leads to great tasting wine?; Sunday, December 30, 2007 5:01:00 PM
Anonymous said...: I am just a student, so maybe I'll ask a very stupid question, but I don't understand how the shotgun method can be applied using a heterogeneous strain?; Friday, May 16, 2008 8:58:00 AM
