Tuesday, October 08, 2013

Non-Darwinian Evolution in 1969: The Case for Junk DNA

I've been having a discussion with Elizabeth Liddle in the comments to: Barry Arrington, Junk DNA, and Why We Call Them Idiots. I think it's important to understand why scientists first started thinking that most of our genome is junk. It's equally important to understand that these scientists were not Darwinists and that their predictions were not based on an understanding of natural selection.

Let's look at a famous paper by Jack Lester King and Thomas Hughes Jukes.1 The title of the paper is "Non-Darwinian Evolution" and it was published 44 years ago in the May 16, 1969 issue of Science [read it at: Science 164:788-798].

The subtitle of the paper is "Most evolutionary change in proteins may be due to neutral mutations and genetic drift" but that's not what I want to talk about. This paper is among the first to predict the presence of large amounts of junk DNA in our genome. King and Jukes didn't call it "junk"—that term was introduced by Susumu Ohno in 1972—but that doesn't matter. When King and Jukes talk about "superfluous DNA" they mean "junk."

Here's the relevant part of the paper ...
Different proteins evolve at different rates, and different sites within specific proteins evolve at different rates. It is possible that these differences reflect differential mutability of the DNA itself, but to us this seems unlikely. It is more likely that proteins, and sites within proteins, differ with regard to the stringency of their requirements. The average rate of evolutionary change as shown in Table 1 is 16 × 10⁻¹⁰ substitution per codon per species per year.

Kimura (4) has estimated, in agreement with Jukes (37), that total molecular evolution in vertebrate species proceeds at the rate of about one amino acid substitution every 2 years. Arguing that Darwinian evolution at that rate would require greater selection pressure than any species can afford, Kimura concluded that most amino acid changes must be due to the passive fixation of selectively neutral mutations.

While we tend to agree with this conclusion, there are several reasons for questioning the arguments on which it was based. Kimura's estimate was deliberately conservative in some respects. The estimate was based (i) on comparisons of the beta chains of horse and human hemoglobins, which appear to have about an average rate of evolutionary change; (ii) on studies of cytochrome c, which is a relatively slowly evolving protein; and (iii) on a minimum estimate based on unsequenced analysis of triosephosphate dehydrogenase, this probably being a gross underestimate of the true evolutionary rate for that enzyme. The average rate of evolution per codon in the completely sequenced proteins listed in Table 1 is five times Kimura's conservative underestimate. If the rate per codon is extrapolated to the entire haploid DNA genome of 4 × 10⁹ nucleotide pairs, as has been done previously (4, 37), it would appear that mammalian evolution is proceeding at the rate of about two allele substitutions per year. In relatively long-lived mammals this may be 20 substitutions per species per generation; in the human species, this is an evolutionary rate of nearly 60 amino acid substitutions per generation, implying a genome mutation rate including 60 neutral amino acid substitutions per gamete. For several reasons this seems much too high.

For one thing, about 4 percent of base substitutions result in chain-terminating codons; 60 amino acid substitutions imply about three chain-terminating mutations per gamete. Most chain-terminating mutations, if they occur in structural genes, are lethal, or at least produce nonfunctional alleles which have to be eliminated through natural selection. No organism having three lethal or severely deleterious mutations per gamete can survive. In addition, frame-shift mutations, also lethal in structural genes, appear to occur about as frequently as chain-terminating mutations (30), and certainly some of the amino acid substitutions are lethal or biologically harmful. Indeed, as we attempt to demonstrate below, it is unlikely that more than about 10 percent of all mutations are selectively neutral.
Note that 44 years ago scientists had a pretty good understanding of mutation rate and they realized that many amino acid substitutions would be neutral. Nevertheless, some mutations are lethal and if the entire human genome consists of genes (coding) then the lethal mutation rate will be too high for survival. This is the genetic load argument.
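The extrapolation in the passage above is easy to check. Here is a minimal sketch in Python that uses only the figures quoted from the paper. The generation times (10 and 28 years) are my assumptions, picked because they reproduce the quoted "20" and "nearly 60" substitutions per generation; the paper excerpt does not state them. The function names are mine as well.

```python
# Back-of-envelope check of the King and Jukes extrapolation.
# All constants except the generation times come from the quoted passage.

RATE_PER_CODON_PER_YEAR = 16e-10  # average rate from Table 1, as quoted
GENOME_BP = 4e9                   # haploid genome size in nucleotide pairs
BP_PER_CODON = 3


def substitutions_per_year(genome_bp=GENOME_BP, rate=RATE_PER_CODON_PER_YEAR):
    """Substitutions per year if every nucleotide pair were protein-coding."""
    codons = genome_bp / BP_PER_CODON
    return codons * rate


def substitutions_per_generation(generation_years):
    """Scale the yearly figure by an assumed generation time in years."""
    return substitutions_per_year() * generation_years


def chain_terminators(substitutions, fraction=0.04):
    """Apply the quoted 4 percent chain-terminating fraction to a count."""
    return substitutions * fraction


if __name__ == "__main__":
    print(f"per year: {substitutions_per_year():.2f}")
    # Assumed generation times, not stated in the excerpt.
    print(f"per generation (10 y): {substitutions_per_generation(10):.1f}")
    print(f"per generation (28 y): {substitutions_per_generation(28):.1f}")
    print(f"chain terminators among 60: {chain_terminators(60):.1f}")
```

The yearly figure comes out at a little over two, matching "about two allele substitutions per year." Applying 4 percent directly to 60 gives about 2.4, which is in the neighborhood of the "about three" in the paper; the exact figure depends on how the total number of base substitutions is counted.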
A second error is the assumption that all or most mammalian DNA consists of structural genes. Older estimates (see 38) of maximum gene number in mammals rarely exceed 40,000 genes per haploid genome. If the average gene consists of 1000 nucleotide pairs, extrapolation from the estimated evolutionary rate of 16 × 10⁻¹⁰ substitution per codon per year gives one amino acid substitution per species per 50 years. This is a far more believable figure. But only 4 × 10⁷ nucleotide pairs, or 1 percent of the mammalian genome, is thus accounted for. Either 99 percent of mammalian DNA is not true genetic material, in the sense that it is not capable of transmitting mutational changes which affect the phenotype, or 40,000 genes is a gross underestimate of the total gene number.

Rates of spontaneous mutation to recessive lethal and visible mutants in mammals are of the order of 10⁻⁶ to 10⁻⁵ per locus per generation (38). If there are 40,000 genes, the total rate of mutation to lethal or nonfunctional alleles would be between 4 and 40 percent per gamete. From this consideration alone, it is clear that there cannot be many more than 40,000 genes.
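The numbers in the two paragraphs above can be reproduced with a short calculation. This is a sketch using only the quoted figures (40,000 genes, 1000 nucleotide pairs per gene, a 4 × 10⁹ genome, and the per-locus mutation rates); the function names are mine and nothing here comes from the paper beyond those constants.

```python
# Gene-number arithmetic from the quoted King and Jukes passage.

GENOME_BP = 4e9                   # haploid genome, nucleotide pairs
GENES = 40_000                    # older upper estimate of gene number
BP_PER_GENE = 1000                # assumed average gene length in the paper
RATE_PER_CODON_PER_YEAR = 16e-10  # average rate from Table 1


def coding_fraction(genes=GENES, bp_per_gene=BP_PER_GENE, genome_bp=GENOME_BP):
    """Fraction of the genome occupied by structural genes."""
    return genes * bp_per_gene / genome_bp


def years_per_substitution(genes=GENES, bp_per_gene=BP_PER_GENE,
                           rate=RATE_PER_CODON_PER_YEAR):
    """Average number of years between amino acid substitutions."""
    codons = genes * bp_per_gene / 3
    return 1 / (codons * rate)


def lethal_load_per_gamete(genes, per_locus_rate):
    """Expected lethal or nonfunctional mutations per gamete."""
    return genes * per_locus_rate


if __name__ == "__main__":
    print(f"coding fraction: {coding_fraction():.1%}")
    print(f"years per substitution: {years_per_substitution():.0f}")
    for rate in (1e-6, 1e-5):
        load = lethal_load_per_gamete(GENES, rate)
        print(f"per-locus rate {rate:g}: load {load:.0%} per gamete")
```

The coding fraction is 1 percent, the interval between substitutions is a bit under 50 years, and the load spans 4 to 40 percent per gamete. Raising `GENES` well above 40,000 pushes the upper end of the load toward one lethal mutation per gamete, which is why the authors conclude there cannot be many more genes than that.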
This begins the discussion about whether most of our genome is functional. The data seems to suggest that a large percentage of the human genome is immune to mutational changes.
In extensive studies of the spontaneous mutation rate of Drosophila melanogaster, the average lethal mutation rate was 3 × 10⁻⁶ per locus and 10⁻² per genome (39). Thus, the fruit fly has about 3000 loci that are capable of mutating to lethal alleles. If only a third of all loci are capable of mutating to lethal alleles under laboratory conditions, there may be perhaps 10,000 Drosophila cistrons. If the average cistron size is 1000 nucleotides, this accounts for about 10 percent of Drosophila DNA (8), since drosophilas have much less DNA per cell than mammals have.
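The Drosophila estimate follows the same pattern: divide the per-genome rate by the per-locus rate, then scale up. Here is a hedged sketch using the quoted values. The genome size at the end is not given in the excerpt; it is simply what the quoted "about 10 percent" implies, so treat it as a derived assumption.

```python
# Drosophila locus count from the quoted lethal mutation rates.

LETHAL_RATE_PER_LOCUS = 3e-6   # as quoted
LETHAL_RATE_PER_GENOME = 1e-2  # as quoted


def lethal_capable_loci(genome_rate=LETHAL_RATE_PER_GENOME,
                        locus_rate=LETHAL_RATE_PER_LOCUS):
    """Number of loci implied by the ratio of the two rates."""
    return genome_rate / locus_rate


def total_cistrons(lethal_loci, lethal_fraction=1 / 3):
    """Total loci if only `lethal_fraction` of them can mutate to lethals."""
    return lethal_loci / lethal_fraction


def implied_genome_size(cistrons, bp_per_cistron=1000, coding_share=0.10):
    """Genome size implied by the quoted 10 percent figure (derived)."""
    return cistrons * bp_per_cistron / coding_share


if __name__ == "__main__":
    loci = lethal_capable_loci()
    print(f"lethal-capable loci: {loci:.0f}")
    # The paper rounds the locus count to 3000 before scaling up.
    cistrons = total_cistrons(3000)
    print(f"total cistrons: {cistrons:.0f}")
    print(f"implied genome size: {implied_genome_size(10_000):.0e} nucleotides")
```

The ratio gives roughly 3,300 loci, which the paper rounds to 3,000, and tripling that lands near the quoted 10,000 cistrons.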
It's important to note that back in 1969, scientists had pretty good ideas about genome sizes in different species. It's also important to understand that scientists like King and Jukes were perfectly capable of combining information from a number of different species in order to reach a general conclusion that would be valuable for all species. That form of argument is not as widely practiced today.
There is more direct evidence for the existence of nongenetic DNA. Heterochromatin is known to be nearly devoid of specific genetic information, yet it accounts for about a third of the DNA of those species in which it is cytologically detectable. About 30 percent of mammalian DNA consists of highly repetitive sequences of unknown function (9). In some species there are varying numbers of supernumerary chromosomes that appear to be of no survival value to the organism.
Even 44 years ago, scientists knew about repetitive DNA and other genome sequences that were noncoding. This should dispel the notion that informed scientists were ever ignorant about the composition of noncoding DNA.
Perhaps the most compelling argument for the existence of superfluous DNA is the wide range in the DNA content of vertebrate cells (40, 41). The average mammalian cell contains more than twice the DNA of the chicken cell and almost four times that of the cell of the gar pike. The cell of the bullfrog contains twice as much DNA as that of the toad, and two and a half times as much as that of a man, while the cell of a lungfish has a DNA content 17 times that of the human cell and almost 60 times that of the pike cell. Can it be that these wide divergences in DNA content reflect wide divergences in the number of functional genes? This hardly seems likely.
King and Jukes make three arguments for superfluous (junk) DNA: (1) genetic load, (2) known examples of repetitive, noncoding DNA, and (3) the C-value paradox. These are examples of positive arguments for the existence of substantial quantities of junk DNA in our genome. It should put to rest any claims that the existence of junk DNA was based entirely on ignorance.

King and Jukes were aware of counter-arguments. The most important of these was the claim that a substantial percentage of the genome is transcribed. Remember, this is 44 years ago.
On the other hand, a substantial proportion of mammalian DNA is capable of forming hybrids with specific messenger RNA in vitro (42). Possibly, as Callan suggests (40), numerous nonheritable copies of the essential genetic material are created anew each generation. These multiple copies would transmit specific information by way of messenger RNA, but would not be true genetic material in that they would not transmit information to future generations and would not be directly involved in evolutionary processes. Another important possibility is that much of mammalian DNA is involved in the complexities of the immune response (26).
As is the case today, the fact that more than 1% of the genome is transcribed was not going to falsify the idea that most of the human genome does not carry genetic information.

Keep in mind that the title of this paper is "Non-Darwinian Evolution." Neither King nor Jukes thought of themselves as strict "Darwinists." The idea of junk (superfluous) DNA was promoted by scientists who understood neutral mutations and random genetic drift. Then, as now, strict Darwinists did not like the idea that most of our genome is junk.

1. Jukes earned his Ph.D. in biochemistry in my department in 1933.


  1. As described in this post by Larry, the idea that most of the genome in species with high c-value has informational roles was first questioned many decades ago and has since been basically abandoned by most researchers studying the evolution of genome size. Therefore it makes little sense to keep refuting this idea in endless posts and discussions; it’s like beating a dead horse.

    Instead, I think it would make more sense to evaluate and discuss the ideas published by current experts in the field on the evolution of genome size and the potential roles of the so-called “junk DNA” (jDNA), such as Thomas Cavalier-Smith, Ford Doolittle and Ryan Gregory, which support non-informational functions for jDNA.

    Here is an excerpt from a paper on jDNA published this year by Ford Doolittle (Doolittle WF. 2013. Is junk DNA bunk? A critique of ENCODE. Proc Natl Acad Sci U S A. 110:5294-300):

    “Cavalier-Smith (13, 20) called DNA’s structural and cell biological roles “nucleoskeletal,” considering C-value to be optimized by organism-level natural selection (13, 20). Gregory, now the principal C-value theorist, embraces a more “pluralistic, hierarchical approach” to what he calls “nucleotypic” function (11, 12, 17)”.

    And here are excerpts from Ryan Gregory’s papers:

    “Although some researchers continue to characterize much variation in genome size as a mere by-product of an intragenomic selfish DNA "free-for-all" there is increasing evidence for the primacy of selection in molding genome sizes via impacts on cell size and division rates” (Gregory TR, Hebert PD. 1999. The modulation of DNA content: proximate causes and ultimate consequences. Genome Res; 9(4):317-24).

    “These are the “nucleoskeletal” and “nucleotypic” theories which, though differing substantially in their specifics, both describe genome size variation as the outcome of selection via the intermediate of cell size” (Gregory TR. 2004. Insertion-deletion biases and the evolution of genome size. Gene, 324:15-34).

    Clearly, these scientists describe genome size variation as the outcome of selection via the intermediate of cell size and as emphasized by Doolittle in the conclusion of his paper: by developing a “larger theoretical framework, embracing informational and structural roles for DNA, neutral as well as adaptive causes of complexity, and selection as a multilevel phenomenon … much that we now call junk could then become functional.” (see reference above).

    Therefore, the current thinking of the scholars in the field of genome size evolution and the c-value enigma supports the perspective that most of the genome in species with high c-value is functional.

    1. To clarify, selection on cell size does not necessarily mean that non-coding DNA has a cell size-related function. It is entirely possible for the relationship between genome size and cell size to result in constraints on genome expansion in some taxa (e.g., birds) but not others (e.g., amphibians), for example due to metabolic effects. It would not follow that large cells/genomes are "functional" in amphibians, but rather that they are simply not as constrained by selection against large cell size. This is still based on "nucleotypic" effects, but it does not lead to a conclusion that most non-coding DNA is functional as a determinant of cell size per se.

      I also think it is relevant to provide the full paragraph from Doolittle's review:

      "I submit that, up until now, junk has been used to denote DNA whose presence cannot reasonably be explained by natural selection at the level of the organism for encoded informational roles. There remain good reasons to believe that much of the DNA of many species fits this definition. Nevertheless, while still insisting on SE [selected effects] functionality, we might want to come up with new definitions of function and junk by (i) abandoning the distinction between informational and nucleoskeletal or nucleotypic roles for DNA, (ii) admitting that there may be strong selection for C-value as a determinant of many cell biological features, (iii) fully embracing hierarchical selection theory and acknowledging that different genomic features may have legitimate functions defined and in play at different levels, and (iv) expanding the SE definition of function to include traits that arise neutrally but are preserved by purifying selection (12). Much that we now call junk could then become functional. However, such a philosophically informed theoretical expansion is not what ENCODE, or at least those authors stressing the demise of junk, so far seem to have in mind."

    2. Ryan,

      You have spent much of your scientific career developing the nucleotypic hypothesis, so it is good to have you trying to clarify it.

      After reading your papers on the nucleotypic hypothesis, as well as those by Cavalier-Smith on the nucleoskeletal theory, I understood that both of them describe genome size variation as the outcome of selection via the intermediate of cell size. Also, it appears that Ford Doolittle has a similar understanding, as he writes that Cavalier-Smith proposed that C-value is optimized by organism-level natural selection, and that you embrace a more pluralistic, hierarchical approach that you apparently call “nucleotypic function”.

      It would be helpful if you could address the following issues:

      1. Does the nucleotypic hypothesis describe genome size variation as the outcome of selection via the intermediate of cell size, or not?

      2. What does “nucleotypic function” mean?

      3. What functions do you think might be fulfilled by ‘junk DNA’ (jDNA) as a group of non-informational genomic sequences, if any?

      4. Do you agree with Doolittle’s conclusion that by developing a larger theoretical framework, embracing informational and structural roles for DNA, neutral as well as adaptive causes of complexity, and selection as a multilevel phenomenon, much that we now call jDNA could then become functional?

  2. Thinking about this strictly as a layman, when I first heard about "junk DNA" it seemed to me to be very consistent with the notion of random mutation, which I considered one of the main principles of the Theory of Evolution. How could you not have some (non-lethal) junk passed on if mutations are random? (Whereas if there were no randomness, and evolution was controlled step by step by some outside agency, no junk need be created - if the outside agency were perfect.)

    Further, I could see natural selection playing some role in the amount of this junk, as suggested above, because it seemed to me that the amount of junk would a) act as a buffer against disabling mutations to necessary genes, and b) provide a lot of raw material for further mutation to randomly produce a new beneficial function.

    That is, if somehow there were two types of genetic material, one of which had a mechanism to eliminate all junk before passing itself on to succeeding generations and one which did not, natural selection would favor the latter type.

    However, the key to this is still the random nature of mutations, and as such, natural selection could not provide any fine control, in my intuitive view, because control and randomness are opposing qualities. So my guess is that some amount of junk DNA is good for evolution, but that natural selection will not control the percentage except within a broad range. If true, this was no doubt well-known by the experts, but I present it as what one layman thinks is reasonable based on general evolutionary principles.

    (Of course the empirical evidence determines what is true, not my semi-philosophical musings.)

    1. Certainly you need mutation to create DNA, but you wouldn't have any significant accumulation unless it were unaffected by selection. That was the surprise: that so much of the typical eukaryote genome was evolving neutrally. One might naively expect that there would be some cost to carrying junk around, if only the energy required to replicate it. But this cost seems not to be large enough to matter in eukaryotes (though it seems to in most prokaryotes, possibly because they replicate much more often).

      As far as we can tell, junk does not act as a buffer against mutation. Almost all mutations have a frequency that depends on the length of sequence, not per genome. It's not as if a set number of mutations were fired at the genome randomly, but that's what a buffering function would require. (Claudiu will gladly tell you about exceptions.) Nor is there any reason to think that selection can operate to preserve junk on the off-chance that some of it may some day be useful. This would only work if the value of junk were displayed frequently enough so that individuals with more junk commonly had a reproductive advantage over those with less. Unlikely.
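The difference between a per-nucleotide mutation rate and a fixed number of hits per genome can be made concrete. The sketch below is only an illustration: the genome sizes, the rate, and the event count are hypothetical numbers chosen for the example, and neither function is drawn from any paper cited in this thread. It compares expected hits in functional DNA under the two assumptions.

```python
# Two toy models of how mutations land on a genome.
# All numeric inputs below are hypothetical illustration values.


def hits_per_nucleotide_model(functional_bp, junk_bp, rate_per_bp):
    """Expected hits in functional DNA when every nucleotide mutates at
    the same rate. junk_bp is accepted but deliberately unused: extra
    junk adds its own mutations without diluting those in functional DNA."""
    return functional_bp * rate_per_bp


def hits_fixed_count_model(functional_bp, junk_bp, events_per_genome):
    """Expected hits in functional DNA when a fixed number of events is
    scattered uniformly over the whole genome. This is the assumption a
    buffering role for junk would need."""
    total_bp = functional_bp + junk_bp
    return events_per_genome * functional_bp / total_bp


if __name__ == "__main__":
    functional = 4e7  # hypothetical functional sequence length
    for junk in (0.0, 4e8, 4e9):
        per_nt = hits_per_nucleotide_model(functional, junk, rate_per_bp=1e-8)
        fixed = hits_fixed_count_model(functional, junk, events_per_genome=10)
        print(f"junk={junk:.0e}  per-nucleotide={per_nt:.2f}  fixed-count={fixed:.2f}")
```

Under the per-nucleotide model the expected damage to functional DNA is the same no matter how much junk is present. Only the fixed-count model shows any dilution, so the buffer idea stands or falls on whether a given class of mutation actually behaves that way.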

    2. Jim,
      Your thinking “strictly as a layman” might be right there with the best in the field, if not a step ahead.

      You might be aware of situations in which “the value of junk were displayed frequently enough so that individuals with more junk commonly had a reproductive advantage over those with less”. Think of the primary source for the origin of jDNA.

    3. Jim,

      Claudiu thinks you're really smart because you present the appearance of agreeing with his main obsession. Enjoy it.

    4. I'll stop displaying my ignorance - soon. Meanwhile, thanks for the polite comments. It is a very good point that the buffer hypothesis depends on what the mechanism for mutation is, and that if empirically the rate is the same for all genes there is no advantage to having a buffer. I can conceive of mechanisms which might not work that way, but they are probably not realistic. Mutations caused by radiation (e.g., the three-eyed fish near the nuclear power plant in "The Simpsons") (joke) such that a neutron or gamma ray travels through a body until it hits something might be an example where the buffer would work.

      It still seems to me that random mutations are bound to cause a lot of junk, and that in order for natural selection to eliminate junk it would also have to constrain the passing on of mutations, which would be self-defeating.

      I am taking up too much of the experts' time and will stop commenting for a while. Thanks again for the replies.

    5. Radiation is a fine example of a cause of mutation that acts on a per-nucleotide basis. Bigger genomes are bigger targets, and radiation would be evenly distributed over the sequence, not diluted by a larger genome. It isn't just one neutron. No buffer.

      Yes, random mutation would cause a lot of junk. But I don't understand why you think selection to eliminate it would be self-defeating. Selection is, by definition, acting for immediate advantage. Perhaps you think the long-term benefits of junk DNA would make getting rid of it a bad idea over the long run. But that wouldn't stop selection from eliminating it if there were short-term disadvantages. You would have to appeal to group selection to rescue the case, always a difficult sell.

    6. Jim,

      Infusing research and scientific discussions with common sense is a great asset. In fact, according to Peter Medawar: “The scientific method is a potentiation of common sense, exercised with a specially firm determination not to persist in error”.

      I understand that it is not easy to hang around when discussions often deteriorate into never-ending theological rants disguised as scientific or religious arguments, or into personal insults, but you have to realize that it is these spins that keep discussions alive and the blogs rolling.

      Regarding the ‘buffer hypothesis’, when it comes to radiation and some other mutagens, John is right, the hypothesis doesn’t hold true. That’s why it was abandoned a long time ago. However, as you implied, there might be some other kinds of ‘mutations’ to which it applies.


      I expected that you would address the point that I made about your remark on ‘the value of junk’.

    7. Claudiu,

      I don't know what you were talking about, unless you are confusing what "individual" means. Clearly, there is selection (of a sort) among retroelements for increased insertion. But that has nothing to do with selection at the level we need here. I'm willing to read your explanation of what you meant.

      I suspect that to you, "common sense" means agreement with you.

    8. Consider 2 individuals or populations that have different amounts of the so called ‘junk DNA’ (jDNA) and are exposed to random (i.e. sequence nonspecific) insertional events by endogenous or exogenous retroviral elements, both at the germ line and somatic levels. As we know, although some insertional mutations can be beneficial most are deleterious, some more than others. Statistically, which individual or population would have an advantage against insertional mutagenesis?

    9. I find it difficult to believe that there would be any significant selection from having a little more junk DNA than your neighbor. Of course most insertional mutations are neutral, not deleterious. Group selection would be even weaker.

      It's so convenient that under your theory the disease and its cure are one and the same. Insertion makes the genome bigger, which protects from insertion. Evidence: species with a lot of insertions have bigger genomes! What could be neater, or more hermetically sealed?

    10. John says: Of course most insertional mutations are neutral, not deleterious.

      That’s only true in organisms with large quantities of jDNA. That’s the point of my hypothesis.

      I assume you are familiar with the CRISPR/Cas immune system of bacteria and archaea (1), in which the amount of viral DNA sequences co-opted as antiviral defense mechanism depends on viral activity. Indeed, what could be neater, or more hermetically sealed?

      1. Horvath P, Barrangou R. "CRISPR/Cas, the immune system of bacteria and archaea". Science 327: 167–70. 2010.

    11. My understanding is that you were talking about eukaryote genomes. Not so?

    12. I used the CRISPR/Cas antiviral defense mechanism as an analogous example in which “the disease and its cure are one and the same".

  3. I noted this in a previous thread and I will do it again - from the point of view of someone who entered the field in the early 21st century, it's absolutely amazing how much people in the 1960s and 1970s got right based on so little data, equipped only with the basic quantitative evolutionary framework developed in the premolecular days.

    Contrast that with all the bombastic press releases of the modern era, and all the creationist lies about how evolution has been disproven, etc....

    1. Excellent points, Georgi. The scientific press - especially the internet 'press' - has, I think, caused more problems than it has solved.

  4. I think the world would be different now if we only had the paper by King & Jukes, and not the paper by Kimura. King & Jukes made an empirical argument for a view much more in keeping with the neo-mutationist thinking that had been emerging among biochemists since Anfinsen and others in the late 1950s. Kimura, a theoretician, based his claim on a weak theoretical argument. He focused very much on random genetic drift, and happily joined a strange historic compromise in which Mayr, Simpson, et al framed the emerging threat to neo-Darwinism as an issue of "molecular" evolution, as though the evolution of molecules happened in a separate universe from the evolution of phenotypes.

    Prior to the molecular revolution, Haldane had argued that enzymes-- as opposed to superficial phenotypes-- would reveal more directly the true causes of evolution. After the molecular revolution, Mayr and Simpson argued precisely the opposite, that molecular comparisons revealed a superficial and indirect window on evolution.

    That is, at the same time that Mayr, et al were congratulating themselves for having "unified biology" with neo-Darwinism, they were arguing that "molecular" evolution is different, but that this is no threat to neo-Darwinism, which remains the default view of evolution, because it applies at a different "level." It makes my head spin.

    1. I don't think most people realize that committed Darwinists were opposed to Neutral Theory. Ernst Mayr wrote a book called "What Evolution Is" (2001 !!!) in which he said (page 199) ...

      Molecular genetics has found that mutations frequently occur in which the new allele produces no change in the fitness of the phenotype. Kimura (1983) has called the occurrence of such mutations neutral evolution, and other authors have referred to it as non-Darwinian evolution. Both terms are misleading. Evolution involves the fitness of individuals and populations, not of genes. When a genotype, favored by selection, carries along as hitchhikers a few newly arisen and strictly neutral alleles, it has no influence on evolution. This may be called evolutionary "noise," but it is not evolution.

      We talked about this in class today. All of my students have passed a university course on evolution. Many of them just completed it last year. I asked them to define evolution and 74% of them defined it in terms of adaptation and natural selection. I asked them to name the main mechanisms of evolution and only 18% mentioned random genetic drift.

      In class, several students defended Mayr's concept of evolution. Almost all of them were surprised to learn that molecular phylogenies rely on fixation of nearly neutral alleles. Only two out of 34 had read the "Spandrels" paper. (By next Tuesday they all have to read it.)

      Is it any wonder that the average person doesn't understand modern evolution? It isn't even taught properly in university. Is it any wonder that the creationists misunderstand it?

  5. Larry,

    In regards to the point that you made at the beginning of your article, I recently came across this old letter that you might find interesting. The letter was written in 1979 by Thomas Jukes and directed to Francis Crick. Here are the opening sentences:

    "I am sure that you realize how frightfully angry a lot of people will be if you say that much of the DNA is junk. The geneticists will be angry because they think that DNA is sacred. The Darwinian evolutionists will be outraged because they believe every change in DNA that is accepted in evolution is necessarily an adaptive change. To suggest anything else is an insult to the sacred memory of Darwin." (emphasis added)

    The letter is really revealing, and it reflects many of the ideas that were floating around at the time. Jukes specifically mentions the fact that Roy Britten believed that junk DNA "has a regulatory function." This is what John Mattick is arguing for today.