Sunday, September 20, 2015

Does genome size affect fitness in seed beetles?

Many of us think that the C-Value Paradox isn't really a paradox any more. We think that expansion and contraction of the genome are mostly selectively neutral, so the differing genome sizes of different species are explained by random genetic drift. Only a small percentage of most eukaryotic genomes is actually functional and the rest is junk. In the case of the human genome, about 90% is junk DNA.

Some scientists aren't happy with this explanation of the C-Value Paradox so they have come up with other explanations to account for the differences in genome sizes. A recent paper by Arnqvist et al. (2015) suggests that genome size affects reproductive fitness in seed beetles.

The introduction to their paper is a nice summary of the controversy ...
The general lack of correspondence between nuclear genome size (hence, GS) and organismal complexity is a classic problem in evolutionary biology [1,2]. Current hypotheses for the evolution of GS all rely on balancing forces which act to expand or to reduce GS. They can be broadly categorized into three non-mutually exclusive classes. First, the ‘junk DNA’ hypothesis recognizes that the selfish intragenomic propagation of transposons and other mobile genetic elements leads to the accumulation of mutations throughout the genome, yielding a one-way ticket to genomic obesity [3]. Such slightly deleterious mutations are then purged by very weak negative natural selection at the individual level [4–6], and the efficacy by which selection can rid the genome of mutationally hazardous DNA increases with increasing effective population size [7]. Second, the ‘selection hypothesis’ suggests that genomic reconfigurations associated with variation in GS has consequences for organismal fitness and that GS may, to a large extent, represent a dynamic balance between positive and negative selection on GS [8]. This could come about in many ways. For example, this hypothesis integrates the adaptive significance of gene duplication [9,10] and recent revaluations of the concept of ‘junk DNA’, suggesting that at least part of what was traditionally considered non-functional DNA may in fact have important effects on phenotypes [2,11–14]. Thirdly, a few other hypotheses do not involve natural selection on GS. For example, the ‘mutational equilibrium hypothesis’ suggests that GS represents a dynamic balance between DNA gain through large insertions and loss through small deletions, the rates of which are assumed to scale with GS [15]. Similarly, non-random assortative segregation by chromosome size during meiosis may, under some conditions, affect the evolution of GS [16].
It's hard to get an experimental handle on adaptive explanations because there are very few species where variation in genome size within a species is significant enough to test for fitness differences.

There are 1350 known species of seed beetles¹. These insects lay their eggs in the seed of a leguminous plant and the entire development of the insect (egg-larva-pupa-adult) takes place within the seed. The authors looked at genome sizes in 12 different species of seed beetles and found that the size of the genome ranged from 704 Mb to 1486 Mb. (The human genome is 3,200 Mb.) It's unlikely that this two-fold difference reflects a difference in complexity since the two-fold variation occurs even within a genus (Callosobruchus). There was no correlation between genome size and population size, body size, or egg size.

Within the species Callosobruchus maculatus (cowpea seed beetle) there was a 4-5% difference in genome size among different populations that have been bred in the laboratory for various times over the past 40 years. This suggested a way to examine the effect of genome size on fitness by mating males and females with different sizes of genomes and measuring the number of offspring. Genome size within the species did not correlate with body size, development time, or growth rate but it did correlate with male and female reproductive success.

I've reproduced the key figure from the paper although I modified it somewhat to extend the x-axes to zero.

Their results show that both males and females with larger genomes are more successful at reproducing. There's a lot of scatter in the figure but the authors assure us that the correlation is significant.

Notice that the data for males (open circles) extrapolates back to zero fertilization success at a genome size of about 970 Mb. This seems a little strange since there are quite a few species of seed beetle that have genomes smaller than this. Presumably this will be investigated in future studies.

In addition to presenting positive evidence of selection for genome size, the authors discuss what they think is the lack of evidence for junk DNA. They note that there is no correlation between genome size and population size in the seed beetles. According to Arnqvist et al., if the junk DNA advocates are correct then there should be a correlation.

Here's the conclusion ...
In conclusion, we show that GS varies markedly both between and within seed beetle species and that GS shows rapid and bidirectional evolution. The pattern of evolution of GS is not consistent with a major role for genetic drift in shaping GS and GS did not show correlated evolution with estimates of species-specific relative population size. Within species, GS showed correlated evolution with both male and female reproductive fitness. Collectively, thus, our findings provide novel support for the hypothesis that GS variation results from natural selection in this clade.

What could cause a correlation between genome size and reproductive fitness? The authors suggest several reasons based on speculations by others about the possible function of excess DNA.
The observed positive link between GS and reproductive fitness could arise for several reasons, which are not mutually exclusive. For example, differences in the amount of non-coding DNA may reflect differences in the ability to regulate and fine-tune gene expression, such that larger genomes may be better able to produce phenotypes in high condition under a wider range of environments [2]. Further, differences in the amount of coding DNA may reflect adaptive gene duplication [9], allowing genotypes with larger genomes to be better buffered against deleterious mutations or otherwise be able to produce more adaptive phenotypes [10,54]. Yet another possibility is that variation in telomere length is positively related to reproductive fitness [55]. Alternatively, populations undergoing frequent laboratory bottlenecks may both be purged of deleterious alleles (due to inbreeding) and show larger GS (due to an increased importance of drift). We note, however, that laboratory populations of C. maculatus harbour genetic loads similar in magnitude to both wild populations and other seed beetle species [56,57] and the fact that we found no relationship between years-in-the-laboratory and GS offers no support for this scenario. Disentangling the above possibilities is currently not possible, as the relative contribution of coding and non-coding DNA to the GS variation seen in C. maculatus is unknown. Irrespective of the precise molecular causes of the documented association between genotype size and reproductive fitness, however, our results imply that natural selection acts on GS variation in our model system.
The Onion Test
by Ryan Gregory
I don't find the evidence convincing. Furthermore, even if there were a genuine correlation between genome size and reproductive fitness in this species of seed beetle, I don't think you can reasonably extrapolate this to onions.

None of the explanations make much sense and one of them (telomere length) is silly. I recognize that you don't need an explanation if the data is true—"unknown" is an acceptable answer—but we know enough about genomes and molecular biology to say that a cause and effect relationship is unlikely.

1. Recall that God has an inordinate fondness for beetles [J.B.S. Haldane].

Arnqvist, G., Sayadi, A., Immonen, E., Hotzy, C., Rankin, D., Tuda, M., Hjelmen, C.E., and Johnston, J.S. (2015) Genome size correlates with reproductive fitness in seed beetles. Proc. Roy. Soc. (UK) B published online Sept. 9, 2015 [doi: 10.1098/rspb.2015.1421]


  1. "The general lack of correspondence between nuclear genome size (hence, GS) and organismal complexity is a classic problem in evolutionary biology [1,2]."

    Cue creationist misrepresentation in 3... 2... 1...

    1. It took about 55 minutes, but it is Sunday after all.

  2. I agree that the data does not look, well, significantly significant. It looks like each circle represents the genome and reproductive success of individual beetles, and with such a small sample, widely scattered as it is, the claim that this shows anything is dubious. Random coin tosses could easily give a better result. Even if the scattered data was claimed to show no correlation, consistent with genome size being essentially neutral, well, I would not be convinced of that either with a scattered and small sample.
    On the other hand, larger genomes might have a competitive advantage in males, since it is possible that large genomes might mean larger sperm heads. I do not know if this is so, but it seems reasonable and maybe someone should look into it. In the insect world extremely large and long sperm is thought to give males a competitive edge. There are lots of insects with sperm that are [cue Donald Trump voice] HUUUUGE. So maybe big sperm heads are competitive too.

    1. It looks like each circle represents the genome and reproductive success of individual beetles
      No, they are distinct populations from different geographical regions.
      The thing that bothers me is that, given the variation within and between species, selection has to drive genome size down from time to time.

  3. So how long have these populations had their current sizes? Long enough for selection to effect major changes in genome size?

  4. Larry

    1. Recall that God has an inordinate fondness for beetles

    1350 species of beetles but still one KIND-beetles are still beetles and not grasshoppers, no matter how one looks at it.

    Are there really 1350 species of beetles considering the fact that scientists still struggle to agree what kind of criteria separate one species from the others?

    I agree that the authors are clueless and it is obvious that study was a waste.

    1. Your numbers are off by two orders of magnitude.

      Some 25% of all animal life are beetles. 300k to 400k species in 500 families. Which is interesting since (IIRC) it was AiG that used "family" as a "kind"... at least when dealing with mammals.

      Of course, one must consider exactly how much evolution took place since the ark if there was only one species of beetle on the ark. That's even more impressive than the HLA alleles in humans. It'd be a new species every two months.

    2. 1350 species of beetles

      There are about 1350 (known) species of seed beetles, which are just one subfamily among many in the huge family of leaf beetles (35,000 species in more than 2,500 genera). The number of known species of beetles (Coleoptera) is over 400,000, and new ones are being discovered all the time. There may be millions of them waiting to be described. They are much more varied than any mammalian order, which means that one beetle can be way more different from another than you are from a lemur. After all, both humans and lemurs are practically the same thing -- "still primates and not cats".

    3. 1350 species of beetles but still one KIND-beetles are still beetles and not grasshoppers, no matter how one looks at it.

      Sigh, yes Sceptical Mind, which is why they are all classified as beetles. When you do spot a grasshopper, you are welcome to stand upon a soapbox in the town square and insist it not be called a beetle. Ironically, the only flak you are likely to experience is from fellow creationists who insist it is merely a slightly different "kind".

      I agree that the authors are clueless and it is obvious that study was a waste.

      Since you lack an understanding of most basic scientific concepts yet somehow know enough to mistrust them, your declaration that any scientific study is a waste of time amounts to the blanket statement you would apply to all scientific research. Unless it be that creationist sort of “scientific research” that relies upon “revealed truth” and “strongly held beliefs” to illuminate your shining path to real knowledge.

    4. It must have been another Darwinian miracle for evolution to work out all those variations of species with trial and error with so little time and no reason whatsoever to make the various species in the first place.

      What made evolution create 1350 species of beetles? It must have been a new and unknown driving force such as....

      I don't even have a concept, but I'm sure Darwin's worshipers will find something. Let the games begin!

    5. ... with so little time ...

      The earliest beetles are known from the Pennsylvanian (about 300 million years ago). They started diverging into the modern groups in the Triassic (more than 200 million years ago). Seed beetles (the subfamily Bruchinae) are known from the Late Cretaceous (79 million years ago). That's plenty of time. To produce 1350 species in 80 million years, they would only have to speciate every 7 million years or so, or a little more frequently to compensate for the extinction of some lineages. But insects are not easy to exterminate; they survive even mass extinctions rather well. This is one of the reasons why there are so many of them.
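The "every 7 million years or so" figure can be checked with a short sketch. It assumes, purely for illustration, a simple doubling model in which every lineage splits in two at regular intervals and none go extinct; the function names are hypothetical and not taken from any library.

```python
import math


def doublings_needed(species_count: int) -> float:
    """Doublings required to get from one ancestral species to species_count,
    assuming every lineage splits in two at each step (no extinction)."""
    return math.log2(species_count)


def interval_per_doubling(species_count: int, total_my: float) -> float:
    """Average time (in millions of years) between splits of a lineage."""
    return total_my / doublings_needed(species_count)


# Figures quoted in the comment: 1350 known species in roughly 80 million years.
n_doublings = doublings_needed(1350)
interval_my = interval_per_doubling(1350, 80.0)

print(f"doublings needed: {n_doublings:.1f}")
print(f"one split per lineage every {interval_my:.1f} million years")
```

Under these assumptions about ten and a half doublings are enough, which puts the interval between splits in the 7 to 8 million year range. Extinction would shorten the required interval, as the comment notes.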

    6. In other words, simple genetic drift could accomplish speciation. Doesn't even have to be some major event.

      As for Darwin, I know why you say things like that. It's because you can't understand that there is no high priest of evolution that everyone must follow. You can't understand that all scientists think for themselves and, I know, the majority of people that post here have studied the subject well beyond just Darwin and Dawkins' books. Some have, gasp, even read peer-reviewed papers... actually, I think many have written those papers.

      Again, I understand that you can't deal with a non-authoritarian culture, but you could at least be bothered to understand it.

    7. no reason whatsoever to make the various species in the first place.

      Ah, so there must be a *reason*.

      Must there be a reason for the planets to circle the sun as well, or does gravity work?

  5. Any theory explaining the C-value paradox must address two critical issues:

    (i) the mechanisms and selective forces, if any, leading to the origin of the so called "junk DNA", which I call symbiotic DNA (sDNA), and

    (ii) the mechanisms and selective forces controlling the location and the quantity of sDNA sequences within the genome.

    Approximately half of the human genome consists of recognizable endogenous viruses and transposable elements, and much of the remaining sDNA is composed of remnants of these elements. Therefore, the mechanisms and selective forces behind the genesis of most sDNA sequences are associated primarily with the inserting elements, not with the host. So, the first issue is resolved.

    It is also clear that the genomic sites for the integration of sDNA sequences have been under strong selective pressure. So, that's also resolved.

    The remaining issue is the amount of sDNA. As I pointed out in the linked paper, it is clear that the accumulation of very large quantities of sDNA is also under selective constraints:

    "According to Graur et al., though, “In humans, there seems to be no selection against excess genomic baggage”. However, non-functional or parasitic DNA is under purifying selection in all organisms, although less in some than in others, and there is eloquent evidence on evolutionary constraints on very large genomes [23-25]; in other words “without selection against excess genomic baggage” the human genome might be much larger."

    According to the hypothetical model I proposed on the evolution of genome size, the amount of sDNA as an adaptive defense mechanism against insertion mutagenesis "varies from one species to another based on the rate of its origin, insertional mutagenesis activity, and evolutionary constraints on genome size."

    1. Does genome size correlate with the ERV insertion rates, published in

    2. "Approximately half of the human genome consists of recognizable endogenous viruses and transposable elements, and much of the remaining sDNA is composed of remnants of these elements. Therefore, the mechanisms and selective forces behind the genesis of most sDNA sequences are associated primarily with the inserting elements, not with the host. So, the first issue is resolved."

      Not really. The issue is that endogenous viruses (or better: retroviral-elements) are not the remnants of RNA viruses (as is generally believed), but they are rather the elements where RNA viruses find their origin.

      These retroelements (now known as Transposable and Transposed Elements; short: TEs) should be renamed VIGEs (variation-inducing genetic elements), because that is their function in the genome. They create inheritable genetic variation. Most of it is not selectable.

      Have you ever thought about variation? It has a genetic background. VIGEs are the genetic elements that produce it. VIGEs also easily transform into RNA viruses.

      The eukaryotic harbours an elaborate genetic system to control the activity of VIGEs. VIGEs are predominantly active in the germline cells.

      Inducing variation in the offspring is a genetic trait.

      Frontloading rules.

    3. Why call them VIGEs? Yeah, sure, acronyms make it look sciency and all, but don't be so modest man, I say we name them after their amazing discoverer: how about "Borgies"?
      Frontloading your name in your terms just makes too much sense, doesn't it Borger?

    4. That's a testable hypothesis you've got there, Borger. You have a variation-inducing process which, you suppose, is 'for' the benefit of the organism. Prove it. Demonstrate that a model species with active TEs prospers when compared to controls with none. Demonstrate that TEs are nurtured by the organismal machinery instead of repressed. To the laboratory!

    5. Hmm... The data seems to show ERVs losing their envelope protein, not vice-versa.

      One example off the pubmed search...

  6. I have a question about your usage of fold-change ("doubling is a two-fold change"). If I fold a paper once, I have doubled the thickness. If I fold it twice I have quadrupled the thickness. Three times, I have octupled the thickness. So an n-fold change = 2^n change in amount. Or, the fold change = log2(new amount / old amount). But above and in, say, microarray studies, the common usage seems to be fold change = (new amount / old amount). Indeed, I see the usage of "log2 fold change" to describe what I would think would simply be "fold change". So my question is, does this usage vary across fields? Or did the usage change historically? Or, is my idea of what should be correct usage simply wrong?

    Jeff Walker
    (this post is unmodified from what I deleted but didn't want to post pseudonymously)
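The two conventions in the question can be put side by side in a short sketch. The function names are illustrative only and not from any library; the genome sizes are the 704 Mb and 1486 Mb extremes quoted in the post.

```python
import math


def fold_change(old: float, new: float) -> float:
    """Ratio convention: 'n-fold' means the new amount is n times the old."""
    return new / old


def log2_fold_change(old: float, new: float) -> float:
    """Paper-folding convention: the number of doublings from old to new."""
    return math.log2(new / old)


# Paper folded three times: thickness goes from 1 sheet to 8 sheets.
paper_ratio = fold_change(1.0, 8.0)
paper_folds = log2_fold_change(1.0, 8.0)

# Smallest and largest seed beetle genomes mentioned in the post (in Mb).
genome_ratio = fold_change(704.0, 1486.0)
genome_log2 = log2_fold_change(704.0, 1486.0)

print(f"paper: ratio {paper_ratio:.0f}, doublings {paper_folds:.0f}")
print(f"genomes: ratio {genome_ratio:.2f}, doublings {genome_log2:.2f}")
```

On the ratio convention the genome range is a little over two-fold; on the paper-folding convention it is a little over one doubling. The post's "two-fold" uses the ratio convention.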

    1. Fold used as "times" is unfortunately very common in molecular biology. Like "percent homologous" and all of the other odd language uses that somehow have fixed in that field (especially the "medically related" molecular biology).

    2. Ugh. Percent Homologous is the worst.