The sequence of the grapevine genome was reported in Nature last September (Jaillon et al. 2007). The 56 authors are all members of the French-Italian Public Consortium for Grapevine Genome Characterization [International Grape Genome Program].
The species is the dicotyledonous plant Vitis vinifera and the variety is cultivar Pinot Noir. In this case, the line was a special inbred variety that was about 93% homozygous. It was necessary to use a selfed line because most field varieties are highly heterozygous, which would have made it more difficult to assemble the sequence using the shotgun strategy.
The genome has 19 chromosomes amounting to 487 Mb of DNA (487 × 10⁶ base pairs). This is comparable in size to the three other plant genomes that have been sequenced: rice, poplar, and Arabidopsis.
The published sequence is referred to as a "high-quality draft" by the authors. They report 30,434 protein-encoding genes and 600 tRNA genes. (Ribosomal RNA genes aren't included in the paper.) This is somewhat fewer genes than poplar (45,555) and rice (37,544) but more than Arabidopsis (27,029). However, this might be deceptive since the total number of identified genes tends to decrease as annotation proceeds and annotation of the Arabidopsis genome is much further along than annotation of the other genomes.
About 41% of the genome is composed of transposons—most of which are non-functional pseudogenes. Genes make up 46% of the genome (7% exons, 37% introns). This is a much lower percentage of junk DNA than typical mammalian genomes.
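To put those percentages in absolute terms, here is a small back-of-the-envelope sketch in Python. It uses only the figures quoted in this post (487 Mb, 30,434 protein-encoding genes, and the percentages above); the variable names are my own and nothing here comes from the paper's methods.

```python
# Figures quoted in the post; names are illustrative, not from the paper.
GENOME_MB = 487        # total genome size in megabases
GENE_COUNT = 30_434    # reported protein-encoding genes

# Fractions of the genome as quoted above.
fractions = {"transposons": 0.41, "exons": 0.07, "introns": 0.37}

# Convert each fraction to megabases of DNA.
megabases = {name: round(GENOME_MB * frac, 2) for name, frac in fractions.items()}

# Average number of protein-encoding genes per megabase.
genes_per_mb = GENE_COUNT / GENOME_MB

for name, mb in megabases.items():
    print(f"{name}: {mb} Mb")
print(f"average gene density: {genes_per_mb:.1f} genes per Mb")
```

By this rough arithmetic, exons account for only about 34 Mb of the genome, while transposons account for roughly 200 Mb.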
Vitis vinifera is a dicotyledonous plant like the poplar tree and the small flowering plant Arabidopsis. Rice is a monocot. Monocots and dicots are thought to have diverged about 200 million years ago. Previous studies have suggested that grapevine should be more closely related to poplar than to Arabidopsis and the genomic sequence confirms that relationship.
One of the most interesting problems in plant evolution is the tracking of various genome duplications that have occurred. Most plants show traces of recent polyploidization events and/or more ancient ones. Duplication is most easily seen by looking at paralogous genes in gene families, but the evidence for large-scale duplication comes from comparing large blocks of sequence. Syntenic regions (or paralogous regions) within a haploid genome, where runs of paralogous genes appear in the same order, are strong evidence of ancient duplications.
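The idea of a syntenic block can be illustrated with a toy sketch in Python. This is not the method used in the Nature paper; it is a deliberately simple greedy scan, assuming that paralogous gene pairs have already been identified and that each gene is represented by its index along a chromosome. The function name, the `max_gap` and `min_size` parameters, and the example data are all hypothetical.

```python
from typing import List, Tuple

Anchor = Tuple[int, int]  # (gene index on region A, gene index on region B)

def find_syntenic_blocks(anchors: List[Anchor],
                         max_gap: int = 2,
                         min_size: int = 3) -> List[List[Anchor]]:
    """Group paralog pairs into runs that keep the same gene order.

    Toy assumptions: only same-orientation runs are detected, and a run
    is broken as soon as one pair fails to extend it.
    """
    blocks: List[List[Anchor]] = []
    current: List[Anchor] = []
    for a, b in sorted(anchors):
        if current:
            prev_a, prev_b = current[-1]
            # Extend the run if both positions advance by a small step.
            if 0 < a - prev_a <= max_gap and 0 < b - prev_b <= max_gap:
                current.append((a, b))
                continue
            # Otherwise close the run, keeping it only if it is long enough.
            if len(current) >= min_size:
                blocks.append(current)
        current = [(a, b)]
    if len(current) >= min_size:
        blocks.append(current)
    return blocks

# Made-up example: four collinear pairs, one stray pair, one short run.
example = [(1, 10), (2, 11), (4, 12), (5, 13), (9, 3), (20, 40), (21, 41)]
blocks = find_syntenic_blocks(example)
print(blocks)
```

In this made-up example only the first four pairs form a block. The stray pair and the two-gene run are discarded, which is the point: isolated paralogs can arise from single-gene duplications, whereas long ordered runs point to duplication of a whole region.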
The figure below is taken directly from the Nature paper. It shows syntenic regions within the grapevine genome (left). Each colored region corresponds to a stretch of paralogous (homologous) genes. As you can see, chromosomes 1, 14, and 17 each contain a large block of similar sequence (light green). Chromosomes 10, 12, and 19 have a different syntenic region (red).
The evidence suggests an ancient hexaploidization in the lineage leading to grapevine. When the syntenic regions of poplar (middle) and Arabidopsis (right) are mapped, you can see that the patterns get much more complicated and the regions become scrambled. The simplest explanation is that the grapevine genome is close to the ancestral genome of all dicots and the poplar and Arabidopsis genomes have undergone additional duplications accompanied by gene loss. The rice genome shows no evidence of the ancient tripling of the genome in dicots.
The phylogenetic tree looks like this—where stars represent duplication events. There has been at least one, and possibly two, duplications in the lineage leading to rice. It will be interesting to see if other monocot genomes show evidence of these duplications or whether they are specific to rice.
It appears that there have been two polyploidy events in the lineage leading to Arabidopsis from the time it diverged from the other two dicots. I don't think anyone has a good explanation for why genome duplications are so frequent in the evolution of vascular plants.¹
Note that the gene duplications give rise to larger gene families in flowering plants, which means that there are fewer distinct genes in plant genomes than in mammalian genomes. Of course, plants have a number of metabolic pathways that aren't found in animals and some of the genes for these pathways are specifically amplified in the grapevine genome.
For example, there are more genes for stilbene synthases in grapevine than in poplar or Arabidopsis. Stilbene synthases are essential enzymes in the resveratrol pathway. Resveratrol is the wine chemical associated with the presumed health benefits of wine consumption.
The grapevine genome also has extra copies of the genes for terpene synthases. These are responsible for synthesis of resins, oils, and aromas that give wine its unique taste. These genes are probably the result of selective breeding over the course of several thousand years.
UPDATE: Read about The Second Grapevine Genome Is Published.
1. Perhaps the Intelligent Design Creationists can explain this using their "scientific" theories.
Jaillon, O., Aury, J.M., Noel, B., Policriti, A., Clepet, C., Casagrande, A., Choisne, N., Aubourg, S., Vitulo, N., Jubin, C., Vezzi, A., Legeai, F., Hugueney, P., Dasilva, C., Horner, D., Mica, E., Jublot, D., Poulain, J., Bruyère, C., Billault, A., Segurens, B., Gouyvenoux, M., Ugarte, E., Cattonaro, F., Anthouard, V., Vico, V., Del Fabbro, C., Alaux, M., Di Gaspero, G., Dumas, V., Felice, N., Paillard, S., Juman, I., Moroldo, M., Scalabrin, S., Canaguier, A., Le Clainche, I., Malacrida, G., Durand, E., Pesole, G., Laucou, V., Chatelet, P., Merdinoglu, D., Delledonne, M., Pezzotti, M., Lecharny, A., Scarpelli, C., Artiguenave, F., Pè, M.E., Valle, G., Morgante, M., Caboche, M., Adam-Blondon, A.F., Weissenbach, J., Quétier, F., Wincker, P.; French-Italian Public Consortium for Grapevine Genome Characterization (2007) The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature 449:463-467. [PubMed] [Nature]
[Photo Credit: Nature]