In the late 1960s scientists started looking at the complexity of the genome itself. They soon discovered that large genomes were often composed of huge amounts of repetitive sequences. The amount of "unique sequence" DNA was only a few percent of the total DNA in these large genomes.[1] This gave rise to the concept of junk DNA and the recognition that genome size was not a reliable indicator of the number of genes. That, plus the growing collection of genome size data, soon called into question simplistic diagrams like the one shown here, from an article by John Mattick in Scientific American (Mattick, 2004). (There are many things wrong with the diagram. Can you identify all of them? See "What's wrong with this figure?" at Genomicron.)
Today we know that there isn't a direct correlation between genome size and complexity. Recent data, such as those from Ryan Gregory's website (right), reveal that the range of DNA sizes in many groups can vary over several orders of magnitude [Animal Genome Size Database]. Mammals don't have any more DNA in their genome than most flowering plants (angiosperms). Or even gymnosperms, for that matter.
Many of us have been teaching this basic fact for twenty years. The bottom line is ....
Anyone who states or implies that there is a significant correlation between total haploid genome size and species complexity is either ignorant or lying.
It is notoriously difficult to define complexity. That's only one of the reasons why such claims are wrong. Ryan Gregory wants everyone to know that the figure showing genome sizes in different phylogenetic groups is not meant to imply a hierarchy of complexity from algae to mammals.
A recent paper by Taft et al. (2007) says complexity can be "broadly defined as the number and different types of cells, and the degree of cellular organization." We can quibble about the definition but there's nothing better that I know of. The real question is whether organism complexity is a useful scientific concept.
Here's the problem. Have some scientists already made up their minds that mammals in general, and humans in particular, are the most complex organisms? Do they construct a definition of complexity that's guaranteed to confer the title of "most complex" on humans? Or, is complexity a real scientific phenomenon that hasn't yet been defined satisfactorily?
I, for one, don't know whether humans are more complex than an owl, or an octopus, or an orchid. For all I know, humans may be less complex by many scientific measures of complexity. Plants can grow and thrive on nothing but water, some minerals, and sunlight. We humans can't even make all of our own amino acids. Does that make us less complex than plants? Certainly it does at the molecular level.
Back in the olden days, when everyone was sure that humans were at the top of the complexity tree, the lack of correlation between genome size and complexity was called the C-value paradox, where "C" stands for the haploid genome size. The term was popularized by Benjamin Lewin in his molecular biology textbooks. In Genes II (1983) he wrote:
The C value paradox takes its name from our inability to account for the content of the genome in terms of known function. One puzzling feature is the existence of huge variations in C values between species whose apparent complexity does not vary correspondingly. An extraordinary range of C values is found in amphibians where the smallest genomes are just below 10^9 bp while the largest are almost 10^11. It is hard to believe that this could reflect a 100-fold variation in the number of genes needed to specify different amphibians.
So, the paradox arises even if we don't know how to rank flowering plants and mammals on a complexity scale. It arises because there are so many examples of very similar species that have huge differences in the size of their genome. Onions are another example: they are the reason why Ryan Gregory made up the Onion Test.
The onion test is a simple reality check for anyone who thinks they have come up with a universal function for non-coding DNA. Whatever your proposed function, ask yourself this question: Can I explain why an onion needs about five times more non-coding DNA for this function than a human?
Imagine the following scenario. You are absolutely convinced that humans are the most complex species but total genome size doesn't reflect your conviction. The C-value paradox is a real paradox for you. Knowing that much of our genome is possibly junk DNA still leaves room for plenty of genes. You take comfort in the fact that under all that junky genome, humans still have way more genes than simple nematodes and flowering plants. You were one of those people who wanted there to be 100,000 genes in the human genome [Facts and Myths Concerning the Historical Estimates of the Number of Genes in the Human Genome].
But when the genomes of these species are published, it turns out that even this faint hope evaporates. Humans, Arabidopsis (wall cress, right), and nematodes all have about the same number of genes.
Oops. Now we have a G-value paradox, where "G" is the number of genes (Hahn and Wray, 2002). The only way out of this box—without abandoning your assumption about humans being the most complex animals—is to make up some stories about the function of so-called junk DNA. If it turns out that there are lots of hidden genes in that junk then maybe it will rescue your assumption. This is where we get some combination of the excuses listed in The Deflated Ego Problem.
On the other hand, maybe humans really aren't all that much more complex, in terms of number of genes, than wall cress. Maybe they should have the same number of genes. Maybe the other differences in genome size really are due to variable amounts of non-functional junk DNA.
1. Thirty years ago we had to teach undergraduates about DNA reassociation kinetics and Cot curves—the most difficult thing I've ever had to teach. I'm sure glad we don't have to do that today.
Hahn, M.W. and Wray, G.A. (2002) The g-value paradox. Evol. Dev. 4:73-75.
Mattick, J.S. (2004) The hidden genetic program of complex organisms. Sci. Am. 291:60-67.
Taft, R.J., Pheasant, M. and Mattick, J.S. (2007) The relationship between non-protein-coding DNA and eukaryotic complexity. BioEssays 29:288-299.
[Photo Credits: The first figure is taken from a course website at the University of Miami (Molecular Genetics). The second figure is from Ryan Gregory's Animal Genome Size Database (Statistics).]