Thursday, May 24, 2007

The Deflated Ego Problem

"How humans get away with having a small genome"

Believe it or not, that's actually the subtitle of a short article in this month's issue of SEED (June, 2007). Who knew that humans have a small genome?

The author, Yohannes Edemariam, is a frequent contributor to SEED. He lives here in Toronto. Edemariam begins with the usual mythology designed to make you think there's a problem with the human genome [see Facts and Myths Concerning the Historical Estimates of the Number of Genes in the Human Genome]. This "problem" cries out for an explanation ...
Given our complexity (our capabilities for abstract thought, language, the building of civilizations), biologists were surprised at the relatively small number of genes we possess when they first began studying the human genome. It has since become clear that our 20,000 to 25,000 genes can be manipulated by processes that statistically enhance the variety of ways in which each gene becomes manifest in our physical makeup.
This is typical of the rhetoric that pervades the popular science literature and, more importantly, the real scientific literature. The scientific evidence shows that our genome has about 25,000 genes and that's not much more than nematode worms or fruit flies. What this tells us is the same message that developmental biologists have been shouting for 35 years—small changes can have big effects. Clearly, some people haven't been listening.

The human chauvinists are disappointed that our genome isn't as complex as our brains and behavior suggest (to them). They expected to see tangible evidence that humans were at the top of the heap. I call this "The Deflated Ego Problem." The question before us is whether this is a real scientific problem or whether it stems from an incorrect understanding of evolution and development.

Having barely survived a major blow to their ego when the human genome turned out to have fewer than 30,000 genes, the deflated ones have fought back with various schemes to explain the "paradox." What they look for is some special mechanism that we humans possess in order to get a bigger bang for our buck. In other words, they're looking for their missing complexity in other places.

Ironically, the chauvinists don't realize that their "problem" can only be solved by discovering hitherto unknown mechanisms that are confined to humans, or possibly mammals. The reason is obvious. If the mechanism is universal then fruit flies and worms have it as well and we can't use the new-found genome complexity to rationalize why we have so few genes compared to them. After all, the goal here is to explain why we only have a few thousand genes more than those "simple," "primitive," species and the explanation won't work if we all have the same complexity-generating mechanisms. I say "ironically" because many of the special mechanisms being proposed were first discovered in these "primitive" species. Now they're being used to solve the Deflated Ego Problem.

So, what are these magical complexity-generators that "statistically enhance the variety of ways in which each gene becomes manifest ...?" Are they going to solve the Deflated Ego Problem?

I'm not going to tell you which one is being promoted in the SEED article. You'll have to buy the magazine—which I highly recommend in spite of its flaws—to find out the answer. Here's the latest list of the sorts of things that may salvage your ego if it has been deflated.
1. Alternative Splicing: We may not have many more genes than a fruit fly but our genes can be rearranged in many different ways and this accounts for why we are much more complex. We have only 25,000 genes but through the magic of alternative splicing we can make 100,000 different proteins. That makes us almost ten times more complex than a fruit fly. (Assuming they don't do alternative splicing.)
2. Small RNAs: Scientists have miscalculated the number of genes by focusing only on protein-encoding genes. Our genome actually contains tens of thousands of genes for small regulatory RNAs. These small RNA molecules combine in very complex ways to control the expression of the more traditional genes. This extra layer of complexity, not found in simple organisms, is what explains the Deflated Ego Problem.
3. Pseudogenes: The human genome contains thousands of apparently inactive genes called pseudogenes. Many of these genes are not extinct genes, as is commonly believed. Instead, they are genes-in-waiting. The complexity of humans is explained by invoking ways of tapping into this reserve to create new genes very quickly.
4. Transposons: The human genome is full of transposons but most scientists ignore them and don't count them in the number of genes. However, transposons are constantly jumping around in the genome and when they land next to a gene they can change it or cause it to be expressed differently. This vast pool of transposons makes our genome much more complicated than that of the simple species. This genome complexity is what's responsible for making humans more complex.
5. Regulatory Sequences: The human genome is huge compared to those of the simple species. All this extra DNA is due to increases in the number of regulatory sequences that control gene expression. We don't have many more protein-encoding regions but we have a much more complex system of regulating the expression of proteins. Thus, the fact that we are more complex than a fruit fly is not due to more genes but to more complex systems of regulation.
6. The Unspecified Anti-Junk Argument: We don't know exactly how to explain the Deflated Ego Problem but it must have something to do with so-called "junk" DNA. There's more and more evidence that junk DNA has a function. It's almost certain that there's something hidden in the extra-genic DNA that will explain our complexity. We'll find it eventually.
7. Post-translational Modification: Proteins can be extensively modified in various ways after they are synthesized. The modifications, such as phosphorylation, glycosylation, editing, etc., give rise to variants with different functions. In this way, the 25,000 primary protein products can actually be modified to make a set of enzymes with several hundred thousand different functions. That explains why we are so much more complicated than worms even though we have similar numbers of genes.
I don't think any of these explanations are valid because I don't think there's a problem that needs explaining in the first place. I wish scientists and science writers would stop pretending that the Deflated Ego Problem is a real scientific problem and I wish they'd stop promoting their favorite, logically flawed, arguments to defend it.

Since that ain't going to happen, I'd like to offer a bit of advice designed to spare us from rhetorical overload. Here's a little template that all science writers can use next time they're tempted to write about this "problem."
(I/we/the authors) believe that the Deflated Ego Problem is a real scientific problem. (I/we/the authors) propose that explanation number (1/2/3/4/5/6/7) will account for the fact that we have too few genes.

24 comments:

  1. Small RNAs are found in simpler organisms. They serve an important immune function against RNA viruses in plants, for instance, and are found throughout animals. Pseudogenes and transposons don't help with development. At most they provide raw material for evolution, but that doesn't explain human complexity now (assuming they are right and it needs explaining). Alternative splicing, of course, is a common feature of all eukaryotes. I am pretty sure post-translational modification is found in all organisms, period. So, ignoring pseudogenes and transposons, these people have to show that we use one or more of these mechanisms significantly more than other eukaryotes, and that these other eukaryotes don't use a different combination of mechanisms to get the same overall effect.

  2. Jawed vertebrates have a handy adaptive immune system, so need fewer immune defense related genes, I've heard.

    Pete

  3. The scientific evidence shows that our genome has about 25,000 genes and that's not much more than nematode worms or fruit flies.
    Latest estimate is 18,000-18,500 coding genes.

    As a cell biologist, I view genes as tools to make different cell types. Vertebrates all have the same types of cells and thus it is no surprise that most genes are shared by all vertebrates (and thus all vertebrates have similar gene counts). Invertebrates such as worms and flies have almost the same number of cell types that vertebrates have and thus it is no surprise that invertebrates have almost the same (if not the same) number of genes that we have.

    If one were to argue that we humans are "more complex" than other animals, one would have to concede that this complexity is mostly neuronal. This "increased complexity" is due in part to an increase in neuronal number (although many animals such as elephants have many more neurons than humans) and an increase in how these neurons are connected. How to increase the "complexity" of our neurite connections? Well, we may have more neuronal guidance cues through an increase in those types of genes or through processes such as alternative splicing. But probably the biggest difference is in our genetic program for specifying when and how any particular gene is turned on. Sticking with a computer code analogy, our genome may not have more code or more functions but the code is such that the final product works more smoothly. So my guess is that our "increased complexity" is due to how our genes are turned on by DNA (and RNA) regulatory elements. These regulatory sequences may be "more complicated" or "better tweaked" in humans compared to other organisms. But most of our coding genes have roles in cellular functions and specifying different cell types, not neuronal guidance or neuronal connectivity.

  4. Assuming fruit flies don't have alternative splicing isn't safe... Dscam, a neural cell adhesion molecule, has more than 5,000 isoforms in Drosophila, and I imagine this isn't the only example.

  5. Even more plainly, we and chimpanzees have essentially the same number of genes and almost certainly all the same regulatory mechanisms -- we did not evolve any magical, revolutionary new processes. None of the items 1-7 can account for that.

    Basically, evolution just selected for a few tweaks in a small number of alleles/reg sequences that made forebrains balloon up. It's no more special than the changes that made a giraffe's neck grow longer, or that led a stalk-eyed fly to build those bizarre heads. We're magnified mice or exotic fish, and our particular attributes are just an expression of ordinary biological diversity.

  6. Nah, apalazzo, I'd argue the other way. Our neuronal complexity is not a product of fancier genetic control at all -- it's an aspect of plasticity. Human neurons are less precisely specified than fly neurons. What they have is a general program for what a neuron should do, refined by specification to broad regions like cortical layer or zone, and they play out that role in the context of their environment. Start with relatively simple rules, expand the playground, and you see more complexity emerge.

  7. Sorry Larry, my comment is barely legible. I cleaned it up a bit and posted it on my blog:

    http://scienceblogs.com/transcript/2007/05/a_little_note_on_complexity_in.php

    PZ,

    I'm not saying that our neurons are more precisely controlled but that the genetic algorithms are better. Whether this results in a higher degree of plasticity, better organization, or a combination I remain agnostic. Mental traits can be selected for in dogs and other animals so in the end brain function will be a result of different genetic algorithms.

  8. In principle, as few as 40 or so regulatory proteins would suffice to uniquely specify every cell in a human body. So the fact that we have 30,000 genes (give or take) doesn't mean that there is a paradoxical shortfall somewhere, but rather that there is a whole lot of complexity in those 30K genes still waiting for evolution to explore.

  9. Thank you for cutting right to the heart of the matter. I never did understand why it was expected that a human should have more genes than a fruit fly. Now I think I know why.

  10. So my guess is that our "increased complexity" is due to how our genes are turned on by DNA (and RNA) regulatory elements.

    Quite right: for example the genes for "building brains" are switched on longer in humans than in fruit flies, which makes humans have a larger brain than fruit flies... but this isn't exactly news! :)

  11. apalazzo says,

    But probably the biggest difference is in our genetic program for specifying when and how any particular gene is turned on.

    I've posted a comment on The Daily Transcript but for the benefit of Sandwalk readers I'll repeat it here.

    Apalazzo, using my handy-dandy template you could say ....

    I believe that the Deflated Ego Problem is a real scientific problem. I propose that explanation number 5 will account for the fact that we have too few genes.

  12. kat says,

    Assuming fruit flies don't have alternative splicing isn't safe... Dscam, a neural cell adhesion molecule, has more than 5,000 isoforms in Drosophila, and I imagine this isn't the only example.

    Yes, that's the point. All of the arguments are flawed because their proponents don't understand that they apply just as well to the "simple" species with a smaller number of genes.

    In order for any of these arguments to be useful there has to be a case for relative frequency. It's not good enough to simply list one example of a phenomenon and then make the leap to universality. In biology almost everything happens at least once. The key to doing good science is to recognize the difference between exceptions and rules.

  13. Today's issue of Science has an article on this very subject:

    Working the (Gene Count) Numbers: Finally, a Firm Answer? by E. Pennisi

  14. You'll have to buy the magazine—which I highly recommend in spite of its flaws—to find out the answer.

    Between last issue's "Neutral theory overturned!!!" idiocy, and this bordering-on-crankitude sloppiness, I'm feeling like not spending the money on the magazine, despite my enjoyment of earlier issues. I'm turning into a never-satisfied crotchety old man at the age of 29, though, so perhaps I should give it another shot.

  15. Hey, I write for it! You could buy the magazine and just read my column.

  16. It isn't clear that there is a problem. As PZ says, there is very little genetic difference between humans and chimps.

    Brains have to be regulated near the percolation threshold (where, on average, one downstream neuron fires for every upstream firing). Combine that with a big brain and hands for the brain to do something with and the end result is inevitable.

    How many Homo species have there been? Were they "special" too? When did the "specialness" evolve?

    I suspect a big part of it was placental abnormalities which caused oxidative stress in utero, which causes low NO, and causes neuronal hyperplasia. I think this is fundamentally what leads to Asperger's, and the "tool making" phenotype.

    All physiological processes are inherently non-linear. They are all coupled. Throw in hysteresis and feedback, and things can get even more complicated. Coupled non-linear systems are inherently chaotic and are too complicated to model with more than a few variables. A few thousand "extra" genes provide more than enough degrees of freedom to achieve extreme levels of complexity.

    What is surprising to me is the belief that the human genome should be more complicated somehow. Things like metamorphosis of insects seem a lot more complicated than the development that humans go through. Controlling the morphology of multiple cell types simultaneously in diverse tissue compartments? With fluctuating temperatures?

  17. In giving this a little more thought, the "problem" is completely nonexistent. It derives solely from a wrong notion of what "complexity" is, and how biological systems achieve it.

    "Complexity" in biological systems doesn't arise from the number of genes, or the number of bits of DNA, but rather from their interactions. That is, any meaningful concept of "complexity" is going to scale not as n, but as n^2, or even n!.

    Of course, that simplistic approach only looks at binary interactions. There are ternary interactions, and more. A gene produces not a single protein, but that protein with multiple post translational modifications, not all mediated via other proteins.

    But then you have "potential" complexity (number of possible interactions), and "actual" complexity (number of actual interactions). This latter number is the one that is "important", but is the more difficult to get at.

    25,000! is larger than 20,000! (by a factor of some 10^21756). Is that "enough" extra complexity?
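
    As a rough check of that factor, here is a minimal Python sketch. It is not part of the original comment; the function name log10_factorial_ratio is made up for illustration. It uses the log-gamma function so the enormous factorials never have to be built explicitly.

```python
import math


def log10_factorial_ratio(n_large: int, n_small: int) -> float:
    """Return log10(n_large! / n_small!) using log-gamma.

    math.lgamma(n + 1) equals ln(n!), so the difference of two lgamma
    values is the natural log of the ratio of the two factorials.
    """
    ln_ratio = math.lgamma(n_large + 1) - math.lgamma(n_small + 1)
    return ln_ratio / math.log(10)


# Gene counts taken from the comment above (25,000 versus 20,000).
exponent = log10_factorial_ratio(25_000, 20_000)
print(f"25,000! / 20,000! is roughly 10^{exponent:.1f}")
```

    The exponent comes out between 21756 and 21757, which agrees with the figure quoted above. Whether n! is the right measure of "complexity" is, of course, the commenter's assumption and not something the arithmetic can settle.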

  18. In principle, as few as 40 or so regulatory proteins would suffice to uniquely specify every cell in a human body. So the fact that we have 30,000 genes (give or take) doesn't mean that there is a paradoxical shortfall somewhere, but rather that there is a whole lot of complexity in those 30K genes still waiting for evolution to explore.

    Nah.

    Housekeeping alone (DNA replication, transcription, translation...) requires a lot more than 40 genes.

    Genes that are not used are not selected for, so they become pseudogenes very fast. Evolution has zero foresight; you don't carry around useless genes because your descendants could one day need them.

  19. Sorry, overlooked the "regulatory proteins" part, but it doesn't change what I said. :-)

  20. A couple things that occur to me, as a complete layman.

    First of all, don't most plants have genomes 10 times longer or more than most animals? One would think there's probably some kind of pressure that keeps the genome length down in animals, who are after all going to be a lot more sensitive to miscodings... A fruit fly with a whole extra chromosome is probably significantly crippled or nonviable, but a tree with an extra chromosome is just a funny-looking tree.

    I don't know how apt the analogy to computer programming is, but there's no necessary relation between program length and complexity -- complexity arises from the logical flow of the program (the if..then..elses and subroutine calls and GOTO 10s and so forth), not its sheer volume. It might be instructive to point out that the very same program can be made to do more or less complex things depending on its input, as well.

  21. I'm with daedalus2u here. The "amazement" at the small number of genes simply reflects the extent to which people are still mired in the "one gene, one trait" fallacy. But when you look at cell biology, what you find is not this kind of simple picture where one protein does just one thing, but complicated combinatorial interactions. Once you start to think combinatorially, 25,000 or so seems like a huge number. Let's take a simple case. The GABA-A receptor has 5 subunits and about 10 subunit genes that are widely used (actually, the total number of genes is closer to twice that, but let's be conservative). So that gives you 10^5 possible ways in which those genes could potentially assemble into a receptor. Most of those permutations probably don't actually assemble in practice, but there is plenty of leeway for mutation to create new types of GABA-A receptors from existing genes without even the need for gene duplication.
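
    The 10^5 figure can be reproduced with a short Python sketch. This is only an illustration under stated assumptions: it treats each of the 5 subunit positions as independently filled by any of 10 genes, and the constant names are invented here. For comparison it also counts arrangements when the order of the positions is ignored, which is a different (and smaller) number.

```python
from math import comb

SUBUNIT_GENES = 10  # "about 10 subunit genes" from the comment (assumed exact here)
POSITIONS = 5       # subunits per GABA-A receptor, per the comment

# Ordered assignment: each position independently takes one of the genes.
ordered_assemblies = SUBUNIT_GENES ** POSITIONS

# Unordered assignment: only the subunit composition matters (a multiset),
# counted with the standard "stars and bars" formula C(n + k - 1, k).
unordered_compositions = comb(SUBUNIT_GENES + POSITIONS - 1, POSITIONS)

print(ordered_assemblies)      # 100000
print(unordered_compositions)  # 2002
```

    Either way of counting leaves far more possibilities than one receptor per gene, which is the point being made; neither count says anything about which assemblies actually form in a cell.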

    In fact, our number of genes is probably larger than it really needs to be. We don't have a lot of evolutionary pressure for a small genome. So a lot of those genes are probably there from history rather than necessity--evolutionary pathways going through gene duplication because it is a higher probability route to improved function than optimizing an existing protein to have two functions.

  22. K. Signal Eingang asks,

    First of all, don't most plants have genomes 10 times longer or more than most animals?

    Mammalian genomes are pretty large to begin with but, yes, there are many flowering plants with much larger genomes. In most cases this seems to be due to the formation of hybrids where the new species results from the fusion of two other species. This doubles the size of the genome.

    Such cases are easy to recognize because there are two copies of every gene.

  23. David Marjanović, I think you also neglected the "in principle" part of my comment. To elaborate a bit more, if we take the number of cells in a human to be about 1 trillion, and we draw a simple model whereby each and every cell is "defined" by a unique combination of one or more regulatory proteins, then 40 or so regulatory proteins is all you need to give each and every cell its own unique fingerprint.

    That's a bare minimum. Sticking to this binary mode (yes or no), it is clear that the many thousands of regulatory factors (transcription factors, splicing factors, small RNAs, E3's, and the like) in eukaryotes afford combinatorial possibilities that exceed by countless orders of magnitude what we see in humans (and, I daresay, the entire biosphere).

    Factor in some gray - temporal variation, concentration-dependent processes, etc. - and the possibilities are breathtaking.

    So, if we had only 20 regulatory factors, we might have the makings of a paradox. We don't - have this few regulatory factors, or a paradox.
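
    The "40 or so" figure follows from counting on/off combinations, and can be sketched in Python as below. The function name min_binary_factors is hypothetical, and the model is the deliberately simple one described above: each factor is either present or absent, and each cell needs a distinct combination with at least one factor present.

```python
def min_binary_factors(n_cells: int) -> int:
    """Smallest k such that k on/off factors give at least n_cells
    distinct combinations with at least one factor switched on.

    k binary factors yield 2**k combinations; subtracting 1 drops the
    all-off case, matching "one or more regulatory proteins".
    """
    k = 0
    while 2 ** k - 1 < n_cells:
        k += 1
    return k


ONE_TRILLION = 10 ** 12  # cell count assumed in the comment above

print(min_binary_factors(ONE_TRILLION))  # 40
print(2 ** 20 - 1)                       # 1048575 combinations from only 20 factors
```

    With 20 factors the simple binary model tops out at about a million distinct combinations, far short of a trillion, which is why that case "might have the makings of a paradox" while 40 does not.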

  24. It seems that even in relation to brain scaling we are not so extraordinary.

    Herculano-Houzel S. The remarkable, yet not extraordinary, human brain as a scaled-up primate brain and its associated cost. Proc Natl Acad Sci U S A. 2012 Jun 26;109 Suppl 1:10661-8. doi: 10.1073/pnas.1201895109.

    http://www.youtube.com/watch?feature=player_embedded&v=ekKBDadvMrU
