Wednesday, February 28, 2018

Junk DNA and selfish DNA

Selfish DNA is a term that became popular with the publication of a series of papers in Nature in 1980. The authors were referring to viruses and transposons that insert themselves into a genome where they exist solely for the purpose of propagating themselves. These selfish DNA sequences are often thought, incorrectly, to be the same as the Selfish Genes of Richard Dawkins¹ [Selfish genes and transposons]. In fact, "selfish genes" refers to the idea that some DNAs enhance fitness and the frequency of these genes will increase in a population through their effect on the vehicle that carries them. It's an adaptationist view of evolution. The selfish DNA of transposons and viruses is quite different. These sequences only propagate themselves; the fitness of the organism is largely irrelevant. These elements do not contribute directly to the adaptive evolution of the species.

Transposons and integrated viruses are subject to mutation just like the rest of the genome. Deleterious mutations cannot be purged by natural selection because inactivating a transposon has no effect on the fitness of the organism.² As a result, large genomes are littered with defective transposons and bits and pieces of dead transposons. This is not selfish DNA by any definition. It is junk DNA [What's in Your Genome?].

It's important to remember that real selfish DNA makes up only a tiny percentage of the human genome. This is a fact that was not widely known in 1980 although some of the discussion back then alluded to the possibility.

This brings me to a recent article by Itai Yanai and Martin Lercher [Life doesn’t make trash]. They are the authors of The Society of Genes. I wrote a short review of this book where I said that my main beef was their over-emphasis on The Selfish Gene and their adaptationist approach to evolution [Human genome books].

The article in Aeon continues the emphasis on selfish genes and adaptation. Read it yourself to see if Yanai and Lercher are adaptationists or not.

Most of the article is about junk DNA. You have to read very carefully to see that the authors have gotten the basic facts correct. They conclude that about 10% of our genome is functional based on the criterion that it is conserved—although I'm not sure that point comes across very clearly. They say,
There is good evidence for this 10 per cent. If we compare our genome to that of other mammals, we find that 90 per cent of the genome was free to change through random mutations. Those DNA letters apparently did not contribute to the efficiency of the survival machine, us. By contrast, mutations in the remaining 10 per cent were weeded out by natural selection because they would have compromised the DNA sequences’ ability to spread – either by damaging the survival machine’s functioning, or by reducing the sequences’ freeloading capacity. This is the definition of function that has traditionally been used by evolutionary biologists as well as by philosophers of science: if something is conserved by natural selection, then it is functional. Function, then, is identified as the feature that ensures the spread or maintenance of a particular DNA sequence.
So far, so good. I disagree with their description of the rest of the genome. They imply that most of it is selfish DNA composed of transposons like Alus and LINE-1 sequences. I wish they had put more emphasis on the fact that much of our genome consists of defective transposons and viruses that are junk, plain and simple. They aren't selfish DNA today, although they once were.

1. The confusion stems from the fact that Dawkins briefly mentioned these selfish DNAs in his book The Selfish Gene.

2. Strictly speaking, this isn't true. There may be some fitness advantage to eliminating transposons. In species with large populations, this small fitness advantage can lead to small genomes. That explains why most bacteria are not littered with defective transposons.


  1. Note also Hamilton's original ethological coinage for selfish behaviour.

    [prediction: cue a certain somebody to claim Dawkins didn't invent selfish genes]

  2. Dawkins' idea about "selfish genes" is actually that such genes benefit *themselves* - their *own* fitness and reproductive success. He carefully distinguishes this from the idea that genes exist to benefit organisms by giving examples of segregation distorters, where the genes benefit, but their associated organisms are hampered.

    The idea isn't "adaptationist" - unless combined with the claim that *all* genes are selfish. I don't think I have ever heard that claim asserted by anyone.

  3. I suspect I'm not that "certain somebody", but I can point out that genes that were selfish, favoring their own replication without benefiting the replication of the organism were known well before Dawkins's book. Notably, the segregation distortion loci found in Drosophila and in mice, known in the late 1950s and early 1960s. (Not to mention Barbara McClintock's case in corn that turned out to be a transposon).

  4. But much of the junk DNA in large genomes got there by being selfish DNA. Do we want to say that the inactivating mutations convert it from selfish DNA into junk DNA? Or will we allow it to be both selfish (from the perspective of how it came to be there) and junk (from the perspective of why it's still there)?

    1. Mutations convert genes into pseudogenes and pseudogenes are junk. Similarly, mutations convert active transposons into junk DNA.

  5. Junk = not under selection at the organismal level. Active retroposons are selfish junk.

  6. Not all biochem books are 100% negative on Alus. For example Lehninger, Principles by Nelson and Cox:

    "The abundant Alu elements offer many opportunities for intramolecular base pairing within the transcripts, providing the duplex targets required by the ADARs. Some of the editing affects the coding sequences of genes. Defects in ADAR function have been associated with a variety of human neurological conditions, including amyotrophic lateral sclerosis (ALS), epilepsy, and major depression."

    There is alternative splicing in the Glutamate Receptors in addition to A-to-I editing. I saw pictures of slides of differentially expressed Glutamate receptors of alternative splice concentration in the same cell. The alternative splices and A-to-I edits affect ion channel performance and these are differentially expressed on cell and tissue types, especially neurons.

    I think it is way too early to be declaring Alus as junk.

    1. "I think it is way too early to be declaring Alus as junk."

      As has been explained to you many times before, nobody has claimed that all Alus, without exception, are junk.

      Also, the mere observation of A-to-I edits doesn't demonstrate that the editing has a biological function.

      It is involved in some biochemical activity =/= it is functional.

      Differential expression of transcripts was the whole basis of the fatuous and premature ENCODE declaration, so that also doesn't get you to where you want to go.

  7. Mikkel Rumraket Rasmussen,

    Are you the same Rumraket at TSZ?

    I should have added that Alus are implicated in the regulation and triggering of alternative splicing; that's why I mentioned alternative splicing. It was an indirect reference to Alus, and Nelson and Cox allude to it later in the section I quoted.

    25% of CpG sites are inside of Alus. If we, by way of analogy, notice that Alpha satellite tandem repeats provide targets/navigation for histone modification, maybe Alus do the same. I'm not aware of any studies on this yet, but I'm keeping my eyes open for this.

    We also do not yet know the role of transcribed Alus in the Epitranscriptome.

    1. "Epitranscriptome"? Are you inventing new bad -omics by yourself, or did you see that somewhere? Your wordsaladome seems to be growing.

    2. No.

    3. OK, then your sin lies not in invention but in adoption and perpetuation. Still a sin.

    4. From PLOS ONE, 2015: this is one of the first papers to connect Alus to the Epitranscriptome. I saw this train coming 3 years ago. Now it's coming through. Why did I anticipate this? 88 megabytes seems too little info to make something as complex as a human. That's what would be the case if most heritable information is in the DNA and 85% of the DNA is junk. That's just kinda hard to believe.

      Anyway, here is one connection of the Alu with the Epitranscriptome:

    5. Your appeals to intuition and historical rationalizations are boring, and they don't even constitute arguments.

      However things "seem" to you is of no consequence.

    6. Salvador Cordova (liarsfordarwin) says,

      "Why did I anticipate this? 88 megabytes seems too little info to make something as complex as a human. That's what would be the case if most heritable information is in the DNA and 85% of the DNA is junk. That's just kinda hard to believe."

      There are lots of things you find hard to believe including some things that are scientific facts.

      But, putting that aside, let's look at what you believe about the size of the human genome. You say that 15% of the genome is 88 Mb. That corresponds to a genome size of 587 Mb.

      The actual genome size is 3,200 Mb so you are out by a factor of more than five. If we correct your calculation to a functional part of the genome corresponding to 480 Mb, does that make it more believable to you?

      [The actual amount of functional DNA is closer to 10% or 320 Mb.]
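For readers who want to check the arithmetic in the comment above, here is a minimal sketch. It assumes, as that comment does, that "88" means megabases and that 15% of the genome is functional. The helper names are hypothetical and exist only for this illustration.

```python
# Figures quoted in the comment above; the helper names are hypothetical.
GENOME_SIZE_MB = 3200        # human genome size in megabases (Mb)
CLAIMED_FUNCTIONAL_MB = 88   # the "88" figure, read as megabases
CLAIMED_FRACTION = 0.15      # 15% functional, i.e. 85% junk


def implied_genome_size(functional_mb, fraction):
    """Genome size implied if `functional_mb` is `fraction` of the whole."""
    return functional_mb / fraction


def functional_size(genome_mb, fraction):
    """Amount of functional DNA (Mb) for a given functional fraction."""
    return genome_mb * fraction


implied = implied_genome_size(CLAIMED_FUNCTIONAL_MB, CLAIMED_FRACTION)
shortfall = GENOME_SIZE_MB / implied

print(f"Implied genome size: {implied:.0f} Mb")
print(f"Actual genome is larger by a factor of {shortfall:.2f}")
print(f"15% of the actual genome: {functional_size(GENOME_SIZE_MB, 0.15):.0f} Mb")
print(f"10% of the actual genome: {functional_size(GENOME_SIZE_MB, 0.10):.0f} Mb")
```

The point of writing it this way is that the functional fraction is a single parameter, so the 15% and 10% cases can be compared side by side.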

    7. My bad. I thought Salvador meant 88 Mb when he said 88 megabytes. Now I see that he actually meant bytes of information. Apparently the human genome can contain 6.4 giga bits of Shannon information. Who knew?

  8. "The actual amount of functional DNA is closer to 10% or 320 Mb"

    But 1 byte is 8 bits! 3.2 giga bases is 6.4 giga bits of Shannon information (2 bits of information for 4 possible states of A,C,T,G in one position). A megabyte is 1024*1024*8 bits. 1024 = 2^10 which is a kilobit, and

    10% * 6,400,000,000 / (1024*1024*8) = 76 megabytes

    I guess I erred by being too generous. 76 megabytes is small. Lots of complex machines in the man-made world would be hard pressed to function with such a small amount of memory. Hard to believe something as complex as a brain that can self assemble, self-heal, and build computers can be implemented with 76 megabytes.
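The bits-to-megabytes conversion in this comment can be reproduced with a short sketch. It assumes what the comment assumes: each base carries log2(4) = 2 bits (four equally likely, independent states), and a megabyte is 1024*1024 bytes. That makes the result a raw storage capacity, not a measure of biological information. The function name is hypothetical.

```python
import math

GENOME_BASES = 3_200_000_000        # 3.2 gigabases, as quoted in the comment
BITS_PER_BASE = math.log2(4)        # 2 bits for four states: A, C, G, T
BITS_PER_MEGABYTE = 1024 * 1024 * 8


def capacity_megabytes(bases, functional_fraction):
    """Raw storage capacity, in megabytes, of the functional fraction.

    Assumption: every base is an independent, equiprobable symbol,
    so this is an upper bound on storage, nothing more.
    """
    total_bits = bases * BITS_PER_BASE
    return functional_fraction * total_bits / BITS_PER_MEGABYTE


for fraction in (0.10, 0.15):
    mb = capacity_megabytes(GENOME_BASES, fraction)
    print(f"{fraction:.0%} functional: about {mb:.1f} megabytes")
```

With a 10% functional fraction this gives roughly 76 megabytes, matching the figure in the comment. Under the same assumptions a 15% fraction comes to roughly 114 megabytes.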

    1. "Hard to believe something as complex as a brain that can self assemble, self-heal, and build computers can be implemented with 76 megabytes."

      Again, how difficult you find it to believe, or comprehend, is not an argument, or evidence one way or the other.

    2. Apologies to the readers for my replies going to the wrong spot.

      "Again, how difficult you find it to believe, or comprehend, is not an argument, or evidence one way or the other. "

      Agreed, but it's a moot point if the bio-industry finds more and more polyconstrained function in sections of DNA previously deemed junk. Nothing wrong with waiting and seeing, but I see some people quite eager to rule definitively that around 90% of DNA is junk.

      I think "wait and see" can't hurt. A lot of the sentiment at the NIH is that these regions are functional because a lot of disease is in places we've dismissed like tandem and dispersed repetitive elements. For example the D4Z4 tandem repeat. It can tolerate some variation, but at some point enough deletions of the repeated units result in Muscular Dystrophy. We really know very little to be making a rush to judgement one way or the other.

      Once upon a time, there was sentiment even among creationists that Alus were junk. That has slowly changed. This is one of my essays on the matter:

      Now that I'm seeing A-to-I edits and Alternative Splices on individual ion channels on individual neurons, and these protein variations are possibly influenced by Alus, I think there has been a rush to judgement on at least the Alus, which are about 10% of the human genome.