IT WAS a discovery that threatened to overturn everything we thought about what makes us human. At the dawn of the new millennium, two rival teams were vying to be the first to sequence the human genome. Their findings, published in February 2001, made headlines around the world. Back-of-the-envelope calculations had suggested that to account for the sheer complexity of human biology, our genome should contain roughly 100,000 genes. The estimate was wildly off. Both groups put the actual figure at around 30,000. We now think it is even fewer – just 20,000 or so.
"It was a massive shock," says geneticist John Mattick. "That number is tiny. It’s effectively the same as a microscopic worm that has just 1000 cells."
There's more to the story but I'll leave that to another post. Right now I want to focus on the persistent, and false, meme about the "shocking discovery." Here are two previous posts on the subject.
False History and the Number of Genes 2010
Facts and Myths Concerning the Historical Estimates of the Number of Genes in the Human Genome
It is simply not true that knowledgeable experts in the field were surprised by the number of genes in the draft sequence of the human genome published in 2001. Most of these experts were well aware of previous published work in biochemistry and molecular biology. They knew about the genetic load arguments dating back to the 1940s.
The best estimates of the number of genes in the human genome had long been incorporated into the textbooks. Benjamin Lewin, chief editor of Cell, was one of these experts. In his very popular textbook (Genes II) he concluded, in 1983, that there were 30,000 - 40,000 genes. Molecular Biology of the Cell was another popular textbook; its authors (Alberts et al. 1983) estimated that the human genome contained 30,000 genes.
The 2001 draft sequence estimated 30,000 genes. Who was shocked?
The experts predicted about 30,000 genes and that's exactly what was discovered. The most recent updates of the human genome reference sequence have about 25,000 genes, of which 20,000 are protein-coding genes. So the facts of the story are wrong if you go by what the knowledgeable experts were saying before the human genome sequence was published.
... those ignorant of history are not condemned to repeat it; they are merely destined to be confused.
Stephen Jay Gould
Ontogeny and Phylogeny (1977)
It's true that there were many non-experts who had not studied the evidence back in 2001. They may have fallen for back-of-the-envelope guesses by other non-experts. But if you are going to make a point about the state of knowledge you don't quote the non-experts.
So, the facts are wrong. The experts were not shocked by the number of genes in the human genome. If that's true (it is) then what are we to make of the opening sentence of the New Scientist article? ...
IT WAS a discovery that threatened to overturn everything we thought about what makes us human.
That's false as well. Several decades of work before 2001 had shown us that the differences between species were not due to differences in the number of genes but to differences in how and when they were regulated. That was the state of knowledge back then and it's still the state of knowledge today. Nothing about the human genome sequence "threatened" anything we thought about what makes us human. Developmental biologists had essentially solved the problem in the 1980s.
This is the Deflated Ego Problem. Those who repeat the meme believe that humans are much more complex than other species, so they were expecting us to have lots more genes. They were shocked when they learned that humans have about the same number of genes as other animals.
Whenever you read an article that begins with this false meme you can be certain that it's going to describe some solution to the "problem." There are seven common rationales used to explain away the "shocking" discovery that we don't have many more genes than a fruit fly [Vertebrate Complexity Is Explained by the Evolution of Long-Range Interactions that Regulate Transcription?]. You know the article will use at least one of these arguments to cope with their deflated egos. Here's the list copied from a previous post.
1. Alternative Splicing: We may not have many more genes than a fruit fly but our genes can be rearranged in many different ways and this accounts for why we are much more complex. We have only 25,000 genes but through the magic of alternative splicing we can make 100,000 different proteins. That makes us almost ten times more complex than a fruit fly. (Assuming they don't do alternative splicing.)
2. Small RNAs: Scientists have miscalculated the number of genes by focusing only on protein encoding genes. Our genome actually contains tens of thousands of genes for small regulatory RNAs. These small RNA molecules combine in very complex ways to control the expression of the more traditional genes. This extra layer of complexity, not found in simple organisms, is what explains the Deflated Ego Problem.
3. Pseudogenes: The human genome contains thousands of apparently inactive genes called pseudogenes. Many of these genes are not extinct genes, as is commonly believed. Instead, they are genes-in-waiting. The complexity of humans is explained by invoking ways of tapping into this reserve to create new genes very quickly.
4. Transposons: The human genome is full of transposons but most scientists ignore them and don't count them in the number of genes. However, transposons are constantly jumping around in the genome and when they land next to a gene they can change it or cause it to be expressed differently. This vast pool of transposons makes our genome much more complicated than that of the simple species. This genome complexity is what's responsible for making humans more complex.
5. Regulatory Sequences: The human genome is huge compared to those of the simple species. All this extra DNA is due to increases in the number of regulatory sequences that control gene expression. We don't have many more protein-encoding regions but we have a much more complex system of regulating the expression of proteins. Thus, the fact that we are more complex than a fruit fly is not due to more genes but to more complex systems of regulation.
6. The Unspecified Anti-Junk Argument: We don't know exactly how to explain the Deflated Ego Problem but it must have something to do with so-called "junk" DNA. There's more and more evidence that junk DNA has a function. It's almost certain that there's something hidden in the extra-genic DNA that will explain our complexity. We'll find it eventually.
7. Post-translational Modification: Proteins can be extensively modified in various ways after they are synthesized. The modifications, such as phosphorylation, glycosylation, editing, etc., give rise to variants with different functions. In this way, the 25,000 primary protein products can actually be modified to make a set of enzymes with several hundred thousand different functions. That explains why we are so much more complicated than worms even though we have similar numbers of genes.