
Thursday, August 01, 2013

The Junk DNA Controversy: John Mattick Defends Design

The failure to recognize the implications of the non-coding DNA will go down I think as the biggest mistakes in the history of molecular biology.

John Mattick
abc Australia

John Mattick has just published a paper dealing with the controversy over the ENCODE results and junk DNA. As you might imagine, Mattick defends the idea that most of our genome is functional. He attempts to explain why most of the critics are wrong.

The title of the paper is "The extent of functionality in the human genome" (Mattick and Dinger, 2013). It's published in the HUGO Journal. Recall that HUGO (Human Genome Organization) gave Mattick a prestigious award for his contributions to genome research. (See The Dark Matter Rises for a discussion of these contributions.)

UPDATE: Mike White also discusses this paper at: Having your cake and eating it: more arguments over human genome function.

Mattick's paper begins by mentioning three of the papers that were critical of ENCODE results: Dan Graur's paper (Graur et al. 2013), Ford Doolittle's paper (Doolittle, 2013), and the paper by Niu and Jiang (2013).

He begins by addressing one of Dan Graur's points about conservation.

Sequence Conservation

Let's cover a bit of background before dealing with Mattick.

Scientists have fifty years of experience looking at sequences. We have repeatedly observed that some sequences are conserved while others are not. Conserved sequences are remarkably correlated with functional regions of proteins, RNAs, and the genome. By contrast, non-conserved sequences almost always correlate with nucleic acid and amino acid sequences that are not essential for function. In the case of genomic sequences (DNA) these non-conserved sequences have often been tested and, with only a few exceptions, no evidence of function has been discovered. (Promoter bashing experiments are a good example.)

In a few cases, large regions of the genome have been deleted with no apparent effect on the organism. In other cases, considerable variation within a population is observed (e.g. humans) and the absence of some stretches of DNA in some individuals does not seem to affect these individuals. Thus, deleting non-conserved DNA doesn't appear to affect fitness, suggesting strongly that it is nonfunctional.

Some parts of the human genome resemble functional genes and transposons but all available evidence indicates that these regions no longer function like the genes and transposons they resemble. They appear to be pseudogenes and defective transposons (or fragments of transposons). By looking at well-identified orthologs in different species we can see that these pseudogenes have gained fixed mutations at a rate perfectly consistent with the rate of fixation of neutral alleles by random genetic drift.

Whole genome comparisons among mammals also demonstrate that 90% of their genomes are not conserved and are evolving as though the nucleotide sequence was irrelevant. (Rate of fixation equals the mutation rate.) This observation is also consistent with genetic load data showing that about 90% of our genome can't be constrained by negative selection.
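
The parenthetical "rate of fixation equals the mutation rate" is the standard neutral-theory result, and a minimal sketch makes the arithmetic explicit. The mutation rate and population sizes below are hypothetical values chosen for illustration, not numbers from any of the papers discussed here.

```python
from fractions import Fraction

def neutral_substitution_rate(population_size, mutation_rate):
    """Rate at which neutral mutations become fixed in a diploid population.

    New neutral mutations arise at 2N * mu per generation, and each one
    is fixed by drift with probability 1 / (2N), so N cancels out.
    """
    new_mutations_per_generation = 2 * population_size * mutation_rate
    fixation_probability = Fraction(1, 2 * population_size)
    return new_mutations_per_generation * fixation_probability

mu = Fraction(1, 10**8)  # hypothetical per-site mutation rate per generation
for n in (1_000, 10_000, 1_000_000):  # hypothetical population sizes
    # The comparison is True for every N: the rate depends only on mu.
    print(n, neutral_substitution_rate(n, mu) == mu)
```

Whatever the population size, the result is the mutation rate itself, which is why sequence that diverges at the mutation rate looks unconstrained.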

It is reasonable to conclude that most of the typical mammalian genome is not functional. The only other possibility is that a large percentage of these genomes is functional but the function has nothing to do with the actual sequence of DNA.

John Mattick does not like this line of reasoning. He says ...
the substantive scientific argument of Graur et al. is based primarily on the apparent lack of sequence conservation of the vast majority (~90%) of the human genome, suggesting that this indicates lack of selective constraint (and therefore function). The fundamental flaw, however, in this argument is that conservation is relative, and its estimation in the human genome is largely based on the questionable proposition that transposable elements, which provide the major source of evolutionary plasticity and novelty (Brosius 1999), are largely non-functional. This argument also overlooks a number of other assumptions and considerations that are tacitly embedded in conservation comparisons and their interpretation (Pheasant and Mattick 2007) ...
Mattick raises five objections that, in his opinion, make the Graur et al. argument (and the data) invalid.

1. ... relative conservation implies function lack of discernible sequence imputes nothing.

This is nonsense. Lack of sequence conservation implies lack of function based on decades of work. It's true that there are some parts of the genome whose function doesn't depend on sequence but the burden of proof is on those who claim that most non-conserved sequences are functional.

2. regulatory sequences have more relaxed structure-function constraints than protein-coding sequences.

As a general rule, regulatory sequences consist of short conserved sequences that bind proteins. They are easy to identify and relatively easy to recognize. They are well conserved between related species. There may be a few exceptions but the general rule applies. Regulatory sequences are just as well conserved as typical amino acid codons in proteins. Mattick is wrong.

3. regulatory sequences are the main genetic substrates for the exploration of phenotypic diversity in animals.

It's true that phenotypic differences between species can often be explained by differences in otherwise conserved regulatory sequences.

4. the conclusion of lack of conservation of most of the human genome is largely based on a circular comparison with the rate of evolution of pan-mammalian ancient ‘repeats’

Mattick complains that the lack of conservation of genomic sequences is largely based on a circular argument. This is hard to understand. He says ...
... one assumes that a subset of the genome is evolving neutrally and is therefore indicative of the rate of unconstrained divergence, then finds that most of the rest of the genome is behaving similarly, which is therefore concluded to also be non-functional. If the first assumption is incorrect ... the derived conclusion of non-functionality of the rest of the genome is also incorrect.
The logic seems relatively uncontroversial. Mattick is correct. If one assumes that part of the genome is evolving neutrally then the conclusion will be invalid if the assumption is incorrect.

The problem is that we have plenty of evidence that most of the genome is evolving neutrally so it's not an assumption. It's a fact. Maybe I don't understand this argument?

5. even if ancient repeats are neutrally evolving (which we think unlikely), the extant comparison set is restricted to those whose orthology is recognizable ...

This is true. We can only determine that pseudogenes and defective transposons are evolving neutrally if we know that the DNA regions in different species are orthologous. Fortunately, we have plenty of excellent examples. These allow us to deduce the common ancestor and determine the rate of fixation of alleles in each lineage. They serve as good examples of fixation of neutral alleles by random genetic drift.

I'm not sure I understand why this is so important to Mattick.

The C-Value Paradox

Mattick correctly identifies the main argument for junk DNA based on genome size comparisons.
... the so-called ‘C-value enigma’ , which refers to the fact that some organisms (like some amoebae, onions, some arthropods, and amphibians) have much more DNA per cell than humans, but cannot possibly be more developmentally or cognitively complex, implying that eukaryotic genomes can and do carry varying amounts of unnecessary baggage.
He argues that, while this may be true, the differences are often due to polyploidy or increases in the amount of defective transposon sequences. It's not clear to me why this invalidates the conclusion that some eukaryotes can carry a lot of junk in their genomes.

He then goes on to say ...
... there is a broadly consistent rise in the amount of non-protein-coding intergenic and intronic DNA with developmental complexity, a relationship that proves nothing but which suggests an association that can only be falsified by downward exceptions, of which there are none known.
That "correlation" only exists in the mind of John Mattick. He mentions "downward exceptions" and says that there are none known. I don't know what he means by this. Does he mean that the minimum size of a vertebrate genome is defined by the pufferfish, with about 27,000 genes and a total genome size of 0.33 × 10^9 bp, or about 1/10 the size of the human genome?

Or does he mean the minimum size of the mammalian genome defined by the Bent-winged bat at approximately half the size of the human genome? Either way, the human genome must contain a lot of junk that isn't required to specify a complex vertebrate.

This argument doesn't make any sense.

Pervasive Transcription

We now come to the most important part of Mattick's defense of ENCODE. The question is whether pervasive transcription is a reflection of noise or whether the majority of the RNAs produced have a function. Keep in mind that most of these RNAs are complementary to defective transposon sequences and their sequence is not conserved. Also keep in mind that only a small percentage reach a concentration of at least one molecule per cell. (Mattick does NOT mention concentration.)

Mattick's main argument for function is ...
... the vast majority of the mammalian genome is differentially transcribed in precise cell-specific patterns (Mercer et al. 2008) to produce large numbers of intergenic, interlacing, antisense and intronic non-protein-coding RNAs, which show dynamic regulation in embryonal development ...
Let's think about that for a minute.

Let's assume that the human genome is littered by chance with short sequences that resemble transcription factor binding sites. This has to be true unless there is strong negative selection against anything that resembles the binding sites of any transcription factor. There's no conceivable way that this could happen so it follows logically that there will be spurious binding sites.
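
A rough back-of-the-envelope sketch shows why chance matches are unavoidable. It assumes, purely for illustration, that all four bases are equally likely and independent, that a binding site is a single fixed sequence, and that the motif lengths tried are hypothetical. Real binding sites are degenerate, which only increases the number of matches.

```python
def expected_chance_matches(genome_size_bp, motif_length_bp):
    """Expected number of exact matches to one fixed motif in random sequence.

    Assumes equal, independent base frequencies and counts both strands
    (hence the factor of 2, which applies to non-palindromic motifs).
    """
    positions = genome_size_bp - motif_length_bp + 1
    return 2 * positions / 4 ** motif_length_bp

GENOME_SIZE_BP = 3_200_000_000  # approximate size of the human genome
for k in (6, 8, 10):  # hypothetical binding-site lengths in bp
    print(k, round(expected_chance_matches(GENOME_SIZE_BP, k)))
```

Under these assumptions even a 10 bp site is expected to turn up thousands of times by chance, and shorter sites far more often.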

Transcription factors will bind to these spurious nonfunctional sites as long as the DNA is available for binding. In some cases the accidental binding of a transcription factor would lead to spurious, accidental, transcription.

In almost all cases, these spurious transcripts will be extremely rare—their concentration will be less than one transcript per cell. This is important since you can't have a serious discussion of this issue without considering concentration.

If our assumption is correct, there's one other feature of the spurious transcription that must be observed: the transcription will be cell specific or developmentally regulated. This is because different transcription factors are present in different cell types and at different stages of development. It's also because the accessibility of different parts of the genome varies from cell type to cell type and at different stages of development. This is the transition from "open" chromatin to a "closed" version resembling heterochromatin.

We're left with the conclusion that spurious, accidental, transcripts must be differentially expressed as a function of cell type and development. That's exactly what we observe. But Mattick uses this necessary feature as an argument for function. That makes no sense.

He claims that ...
... differential expression (including extensive alternative splicing) of RNAs is a far more accurate guide to the functional content of the human genome than logically circular assessments of sequence conservation, or lack thereof. Assertions that the observed transcription represents random noise (tacitly or explicitly justified by reference to stochastic (‘noisy’) firing of known, legitimate promoters in bacteria and yeast), is more opinion than fact and difficult to reconcile with the exquisite precision of differential cell- and tissue-specific transcription in human cells.
I don't think it's fair to say that spurious transcription is "more opinion than fact." It's a biochemical necessity as long as you understand the properties of DNA binding proteins.

Mattick has one more argument up his sleeve and it's the same argument made by Intelligent Design Creationists.
Moreover, where tested, these noncoding RNAs usually show evidence of biological function in different developmental and disease contexts, with, by our estimate, hundreds of validated cases already published and many more en route, which is a big enough subset to draw broader conclusions about the likely functionality of the rest.
There are over one million different transcripts that have been detected in human cells. In some cases there have been obvious clues that these transcripts have a function. Many of these best candidates have been investigated and it turns out that quite a few have a function.

That's not a surprise. But just because there are functional RNAs does not mean that all RNAs are functional. It does not even mean that a substantial percentage are functional. (Remember that 300 functional RNAs out of one million is 0.03%.)
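
The percentage is easy to check, and it is worth seeing how it changes with the number of validated cases. In this small sketch only the 300 and the one million come from the text above. The larger counts are hypothetical, included only to show the scale involved.

```python
def percent_functional(validated_functional, total_transcripts):
    """Share of detected transcripts with a demonstrated function, in percent."""
    return 100 * validated_functional / total_transcripts

TOTAL_TRANSCRIPTS = 1_000_000  # "over one million different transcripts"
for validated in (300, 3_000, 30_000):  # only 300 is taken from the text
    print(validated, percent_functional(validated, TOTAL_TRANSCRIPTS), "%")
```

Even a hundredfold increase in validated cases would account for only a few percent of the detected transcripts.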

[Mattick has been] a true visionary in his field; he has demonstrated an extraordinary degree of perseverance and ingenuity in gradually proving his hypothesis over the course of 18 years.

HUGO Award Committee

A Question of Motives

Now we get to the end of the paper and the most astonishing claim. I had to read this several times before I was sure I was interpreting it correctly.
There may also be another factor motivating the Graur et al. and related articles (van Bakel et al. 2010; Scanlan 2012), which is suggested by the sources and selection of quotations used at the beginning of the article, as well as in the use of the phrase “evolution-free gospel” in its title (Graur et al. 2013): the argument of a largely non-functional genome is invoked by some evolutionary theorists in the debate against the proposition of intelligent design of life on earth, particularly with respect to the origin of humanity. In essence, the argument posits that the presence of non-protein-coding or so-called ‘junk DNA’ that comprises >90% of the human genome is evidence for the accumulation of evolutionary debris by blind Darwinian evolution, and argues against intelligent design, as an intelligent designer would presumably not fill the human genetic instruction set with meaningless information (Dawkins 1986; Collins 2006). This argument is threatened in the face of growing functional indices of noncoding regions of the genome, with the latter reciprocally used in support of the notion of intelligent design and to challenge the conception that natural selection accounts for the existence of complex organisms (Behe 2003; Wells 2011).
The last two references are to Michael Behe's paper about functional pseudogenes and to Jonathan Wells' book The Myth of Junk DNA. I don't think I've ever seen a legitimate scientific paper that references that book by Jonathan Wells.

Mattick also uses IDiot terminology when he says that, "...the argument posits that the presence of non-protein-coding or so-called ‘junk DNA’ that comprises >90% of the human genome is evidence for the accumulation of evolutionary debris by blind Darwinian evolution." As you should all know by now, the accumulation of junk DNA is the antithesis of "Darwinian evolution." You should also note that it's mostly IDiots who get confused about the difference between junk DNA and "non-protein-coding" DNA.

I find that very troubling.

Doolittle, W.F. (2013) Is junk DNA bunk? A critique of ENCODE. Proc Natl Acad Sci USA 110:5294-5300.

Graur D., Zheng Y., Price N., Azevedo R.B., Zufall R.A., and Elhaik E. (2013) On the immortality of television sets: “function” in the human genome according to the evolution-free gospel of ENCODE. Genome Biol Evol 5:578-590.

Mattick, J.S. and Dinger, M.E. (2013) The extent of functionality in the human genome. The HUGO Journal 7:2. [doi: 10.1186/1877-6566-7-2]

Niu, D.K. and Jiang, L. (2013) Can ENCODE tell us how much junk DNA we carry in our genome? Biochem Biophys Res Commun 430:1340-1343.


SPARC said...

I've posted the following statement on the Hugo comment page accompanying Mattick's article:

Larry Moran's critique

Readers may be interested in Larry Moran's critique at his Sandwalk blog (

IMO T. Ryan Gregory's onion test cited by Mattick is not so much about the fact that onion genomes are bigger than the human genome but rather about the fact that the sizes of the smallest and the biggest onion genomes differ by a factor of 5. One would have to claim different complexities for these onion species if one believes that most sequences are functional.

It may take some time until it shows up because the following message popped up after I submitted my comment:

Your comment will be checked by a moderator, this should happen within 2 working days. You will receive an email when the comment appears on the site or if it is rejected by the moderator.

Mikkel Rumraket Rasmussen said...

"blind Darwinian evolution" is ID catchphrase. Hmmm.

SPARC said...

You will have to register if you want to comment over there.
If you have access to one of the following sites username and password should be the same:
BioMed Central
Chemistry Central
Current Controlled Trials
Cases Database

Matt G said...

Stealth Cdesign Proponentist? Wow, just wow! Somebody seems to have a hidden agenda. Has he ever made comments about ID or any other form of creationism?

Diogenes said...

Anyone who cites Jonathan Wells as an authority is a charlatan, pure and simple. We can't say for sure Mattick is a religiously-motivated IDiot, and his motivations don't really matter, but his citing of Jonathan Wells AS AN AUTHORITY proves Mattick is a charlatan.

It's all ID catchphrases.

so-called ‘junk DNA’ that comprises > 90% of the human genome is evidence for the accumulation of evolutionary debris by blind Darwinian evolution

Who the hell ever uses the word "Darwinian" in a scientific paper to mean neutral goddamn evolution? An IDiot, that's who. John Timmer quantified just how much SCIENTISTS JUST DON'T SAY "DARWINISM" in a review of IDiot Stephen Meyer's Explore Evolution.

Timmer writes: [In the scientific literature] "Searching for "neo-darwinism" netted 30 references; "neodarwinism" another five. Trying "neodarwinian" and "neo-darwinian" pulled out a whopping 96 references. The term appears to have no significant presence in scientific communications. In contrast, searching for "evolution" pulled out 226,476 papers, while the more specific "selective pressure" 21,553. If this book is all about science, why not use the terminology actual scientists do? Presumably, because the institute producing the book promotes the idea that evolution isn't science, but an ideology, one that their fellows have pulled a Godwin on and attempted to tie to Nazism." [Ars Technica vs. Stephen Meyer's Explore Evolution]

Mattick: "the argument posits that the presence of non-protein-coding or so-called ‘junk DNA’"

Again, only IDiots and muggle journalists say scientists believed ncDNA = junk DNA. As I have said over and over and over and over, no geneticist nor molecular biologist ever said that he himself believed that ncDNA was equal to junk DNA, or a subset thereof. This is a lie promoted by ID creationists and reporters in the muggle press who wanted to sell magazines with a fake Kuhnian "paradigm shaft" based on lying about what scientists believed 10-20 years ago, replacing their real hypothesis with something dumber and easier to disprove.

However, Mattick's profound, ocean-deep ignorance of the history of science is not the same as ignorance of science. One can be, like Mattick, very, very, very, very ignorant of the recent history of science (and I mean the 1980s-1990s, not the 1880s) while not necessarily being ignorant of science itself. But there are other indicators of this.

Mattick: "There may also be another factor motivating the Graur et al. and related articles (van Bakel et al. 2010; Scanlan 2012), which is suggested by the sources and selection of quotations used at the beginning of the article, as well as in the use of the phrase “evolution-free gospel” in its title (Graur et al. 2013): the argument of a largely non-functional genome is invoked by some evolutionary theorists in the debate against the proposition of intelligent design"

Fraudian psychoanalysis, which does not belong in a supposed journal of genomics! If Mattick wants to write papers on psychoanalysis, let him send them to Psychology Today! Fraudian psychoanalysis and proving your point by accusing people vastly smarter than you of having bad motives is a technique universally employed by creationists.

I have never, ever, ever, ever seen such psychoanalysis or armchair psychology employed in any peer-reviewed paper on genetics, molecular biology, physics, or anything scientific. It's like when Nazis like Johannes Stark blathered about "Jewish Physics". This is shameful, and HUGO journal should be ashamed. They should retract Mattick's paper, or they should rename themselves "Modern Amateur Psychoanalysis Journal."

Diogenes said...

Mattick: the substantive scientific argument of Graur et al. is based primarily on the apparent lack of sequence conservation of the vast majority (~90%) of the human genome, suggesting that this indicates lack of selective constraint (and therefore function). The fundamental flaw, however, in this argument is that conservation is relative, and its estimation in the human genome is largely based on the questionable proposition that transposable elements, which provide the major source of evolutionary plasticity and novelty (Brosius 1999), are largely non-functional.

This is patently false and Mattick shows his profound ignorance of sequence comparison methods!

The literature on sequence conservation rates goes back decades before modern genomics, before whole genomes could be sequenced. In the 1970's and early 1980's all molecular biologists had to work with were amino acid sequences, and they had to deduce neutral substitution rates by comparisons of CODING DNA, CODING DNA, NOT NON-CODING DNA and NOT TRANSPOSONS! The Dayhoff matrix was not constructed from damn transposons! What a blunder!

Molecular biologists would laugh out loud at Mattick's fantasies-- mol. biologists have decades of experience at actually MUTATING proteins and seeing what happens, and comparing that to sequence comparisons across many species. They know from experience, NOT "circular logic" as Mattick ignorantly claims, that sequence conservation is the most reliable, simple metric (if you don't have a 3-D protein structure) indicating functional or structural constraints.

Mattick here is guilty of "pot-kettle-black", because he himself is employing circular logic in asserting that cell-type-specific or developmentally specific expression of RNA's at incredibly low levels of concentration is the best evidence of the function. What evidence does he have to back that up? Statistics? Experimentation? Observation? No, he assumes it-- it's his hypothesis, and he presents his hypothesis as if it were data proving his hypothesis. Circular logic.

Hey, if the experimental methods don't support your hypothesis, says Mattick, the experimental methods must be wrong. Pick your methods so as to guarantee you get the result you need, says Mattick. He's Mr. Circular Logic.

Matt G said...

A most appropriate Freudian slip, Diogenes!

Diogenes said...

Here is yet another example of Mattick's circular logic!

"... there is a broadly consistent rise in the amount of non-protein-coding intergenic and intronic DNA with developmental complexity, a relationship that proves nothing but which suggests an association that can only be falsified by downward exceptions, of which there are none known."

Here Mattick is referring to his infamous “Dog’s Ass Plot” that was so effectively skewered by T. Ryan Gregory. That plot of Mattick’s was based on fake data– his diagonal slopes on the bars on the graph? made up numbers– and bordered on scientific fraud. But, having made a fake plot based on fake numbers, Mattick now treats his fake data as real.

And as for his claim of "downward exceptions", "there are none known", not only is this false, the use of the word "exception" must be challenged. The word "exception" implies that there's a rule-- but where's the rule? Mattick's Dog's Ass Plot?

Now we come to the circular logic. Who the hell decided that the Dog's Ass Plot could not be falsified by UPWARD exceptions!? Mattick decided that-- Mattick announced from atop lofty Olympus that no UPWARD exceptions can falsify his Dog's Ass Plot BECAUSE HE KNOWS THERE ARE DOZENS AND DOZENS, LIKELY MANY MANY MORE, UPWARD EXCEPTIONS. Again, it's circular logic because when the experimental evidence doesn't support Mattick's hypothesis, he announces 'hey, THOSE methods don't count'!!

This is like arguing with some Apple Computer fan circa 1995: "Well yeah, Apple computers suck at gaming, but computers aren't MEANT to be used for gaming!"

And as for his claim that downward exceptions don't EXIST, here's a bunch for ya, John!!

The Drosophila genome is very reduced in comparison to other flies. Its genome can be directly compared to other flies with, presumably, more junk.

In bladderworts [Genlisea-Utricularia] there is wide variation. Utricularia gibba is now famous for its fully sequenced, tiny genome, but the beauty part is, there are other bladderworts that are closely related but with far larger or even smaller genomes, that just beg to be sequenced. Utricularia prehensilis has 4.56 times as much DNA as U. gibba, while Genlisea hispidula has 18.4 times as much as U. gibba.

Meanwhile, G. margaretae and G. aurea are even smaller than U. gibba!

I have been going about saying someone should write an NIH grant to sequence Utricularia prehensilis or Genlisea hispidula.

The sea urchin has 814 Mbp = 1/4 x Human.

The turkey has 1.1 Gbp, about 1/3 x human.

The frog Hyla nana has 1.89 pg C = 55% of human.

Let’s not forget the 100-fold variation within amphibians, from genomes much smaller (less than 1/3) than human, to Necturus lewisi with a genome 34 times bigger than human.

“An extraordinary range of C values is found in amphibians where the smallest genomes are just below 10^9 bp while the largest are almost 10^11 [100 billion basepairs, compared to 3.2 billion in humans]. It is hard to believe that this could reflect a 100-fold variation in the number of genes needed to specify different amphibians.” [Lewin, Genes II]

un said...

This is another brief response to Mattick's claims from Mike White: Having your cake and eating it: more arguments over human genome function.

It's also interesting to note that back in 2007, Mattick believed that at least 20% of the genome is functional, which is, as Ryan Gregory showed, consistent with the view of many geneticists and molecular biologists (including Comings himself). Even Dan Graur in his lecture at the SMBE a few weeks ago guesstimated that junk DNA comprises at least 65% of the human genome (nowhere near the upper limit that Mattick set for himself). So, what's wrong now? If he truly believed back in 2007 (3 years after he published his now famous article in Scientific American) that it is reasonable to say that 20% of the genome is functional, why is he acting now as though he's revolting against an orthodoxy? And why is he defending the mistakes and the media hype of the ENCODE project?

Speaking of the ENCODE project, Dan Graur has been digging through one of their papers, and he isn't particularly happy about it: #ENCODE, Ewan Birney, Michael Snyder, and Mark Gerstein strike out.

un said...

I also find it strange that he cited Michael Behe in this particular context. I have never seen Behe explicitly speaking about junk DNA before. In fact, his more recent writings seem to suggest that he accepts the fact that the genome seems to be littered with excess DNA.

This is what he had to say:

"If DNA were exactly like a blueprint, with no wasted space, and every line and curve representing a point of building, then this mutation rate would be fatal. After all, one critical mistake is all it takes to kill (or cause the building to collapse). But in fact, DNA isn't exactly like a blueprint. Only a fraction of its sections are directly involved in creating proteins and building life. Most of it seems to be excess DNA, where mutations can occur harmlessly." (Emphasis added)

~ Michael Behe, The Edge of Evolution, P. 66

John Harshman said...

Let's dial it back a teeny bit. Mattick isn't citing Behe and Wells for any purpose other than to show that IDiots are doing what he says they're doing: using the supposed functionality of all DNA as an argument for ID, and against the argument that junk DNA shows ID wrong. He's not claiming the IDiots are correct in their arguments for ID. He hasn't offered any support for ID. It's all just part of his attempt to psychoanalyze his opponents.

Diogenes said...

All right all right, if that was Mattick's purpose then I apologize for calling him a charlatan.

However, the more I look at this paper, the more I see Mattick's rule here is:

What's not self-contradictory in this paper is circular logic.

Georgi Marinov said...

As I said in a previous thread, he is not directly supporting ID, that is true. And all he is technically doing is accusing people who defend junk DNA of doing so out of (anti)religious motivations. However, this is the first article ever by a reputable scientist in a reputable journal that talks about ID so prominently and does not say anything negative about it, which on its own is problematic, and more importantly, if you are willing to make the accusation that the concept of junk DNA is maintained in the scientific community by atheistic bias, you should be prepared for the counterargument, which is that you are supporting the scientifically indefensible notion that the whole genome is functional because you yourself have some nefarious religious agenda, even if you have not yet come out of the closet with it.

psbraterman said...

And while we're at it, I wish journal reviewers demanded from their contributors the same transparency of sentence construction that Moran gives us. Then it might be a lot clearer exactly what Mattick's reasoning consists of.

Larry Moran said...

@John Harshman,

Mattick and Dinger accuse some scientists of arguing against junk DNA because they are motivated by their desire to refute intelligent design. According to Mattick, their argument is that an intelligent designer would never put junk in our genome.

Mattick and Dinger then say ...

This argument is threatened in the face of growing functional indices of noncoding regions of the genome, with the latter reciprocally used in support of the notion of intelligent design and to challenge the conception that natural selection accounts for the existence of complex organisms.

Why would they say that? Why not just say Graur and others are wrong because the scientific evidence supports function and not junk? Why did they choose to ally themselves with intelligent design?

And why did they conflate noncoding DNA and junk DNA and use the term "blind Darwinian evolution"? Those are things that the IDiots do routinely. It's safe to assume that Mattick and Dinger are familiar with the debate in the blogosphere since they reference Jack Scanlan's blog post. They must know what they are doing.

I think it's fair to psychoanalyze Mattick and Dinger since they raised the issue.

Oh, and don't forget that they said this in their abstract ...

Finally, we suggest that resistance to these findings is further motivated in some quarters by the use of the dubious concept of junk DNA as evidence against intelligent design.

Georgi Marinov said...

That last sentence, if all one has to read is the abstract, is indeed almost an endorsement of ID.

John Harshman said...

"Almost" in the sense of "not"? You're reading way too much into it, but I'd be willing to change my opinion if the actual paper offered something more. As it stands, this is merely an attack on the motives of his opponents, nothing more.

Georgi Marinov said...

I said "if all one has to read is the abstract".

If you read the paper, you cannot impute nefarious motivation to him if you are to stick to intellectual honesty. But we all know that the set of people who also read the whole paper is a subset (often not very large) of those who read the abstract, so this will have a negative effect just based on this.

John Harshman said...

You're probably right. The IDiots will probably start citing it. The fact that they would have to misinterpret even the abstract to do so won't stop them. Bad writing invites misinterpretation.

Diogenes said...

Larry says: "I think it's fair to psychoanalyze Mattick and Dinger since they raised the issue."

They certainly did raise the issue, and that's to their everlasting shame-- it drops them out below the bottom of the worst of the scientific literature-- but it doesn't mean we should go there.

Nor do we need to! The logic here, the factual errors, are so bad that we don't need to question their motivations! If we get to questioning their motivations, we'll be pushed off what should be the topic:

1. their use of circular logic (while accusing others of the same),

2. their pathetic armchair psychoanalysis in place of evidence,

3. their factual errors,

4. their fallacy of affirming the consequent in spite of Dan Graur's warning not to,

5. their claim that UPWARD exceptions can't disprove the Dog's Ass Plot, and their claim that DOWNWARD exceptions don't exist!!

This is so terrible, we should not be pushed off topic into speculating about their religious beliefs. Sometimes people are pig-ignorant of science for secular reasons also.

Diogenes said...

I'm going to mention some more "Downward exceptions" which according to Mattick don't exist!!


The black-chinned hummingbird, Archilochus alexandri, can fly and hover and has asymmetric flight feathers and an awesome sense of balance, and has 1/4 [26%] as much DNA as a human. This is about the same size as the genome of the sea urchin.

Turkey and chicken have about 1/3 as much DNA as a human.


The bent-winged bat, Miniopterus schreibersi, can fly and has echolocation, but has less than 1/2 [49.4%] as much DNA as a human.

The barking deer Muntiacus muntjak has half as much DNA as a human.


All salamanders, newts, axolotls, caecilians and waterdogs have much, much more DNA than humans, but frogs and toads vary widely.

The ornate burrowing frog, Limnodynastes ornatus, has nearly 1/4 [27%] as much DNA as a human. This is about the same size as the genome of the sea urchin.

Couch's spadefoot toad, Scaphiopus couchii, has less than 1/3 [29%] as much DNA as a human. This is about the same size as chicken or turkey.

The Jamaican laughing frog, Osteopilus brunneus, has about 1/2 [52%] as much DNA as a human. This is more than the bent-winged bat, and about as much as the barking deer.

Within the tree frog genus Hyla there is a four-fold variation, from Hyla nana which has about half [55%] as much DNA as a human, to Hyla cf. lanciformis sp.2 which has 2.12 times as much. (Hyla versicolor has even more but it is apparently tetraploid.)

Genus Xenopus varies from 86% to 2.27 times as much as human.

How can Mattick POSSIBLY claim there are no "downward exceptions"!?

Diogenes said...

I am re-posting this from Mike White's blog:

The main self-contradictory part of Mattick's argument is where he says that repetitive DNA sequences [e.g. transposons] DON'T add to the complexity of the genome, in order to wave away the C value paradox. OTOH, he also says they're functional, all DNA is functional (!) and all of it adds to "developmental complexity", but all DNA does that without adding to "genetic complexity." Go figure!

In this way he explains the supposed superior complexity of the human as compared with, say, the legless salamander Necturus lewisi which has 34.5 times as much DNA as a human (I know, I know, it's ridiculous but that's his logic.)

I can't believe he would write a paragraph like this:

Mattick: "That may be so, but the extent of such baggage in humans is unknown. However, where data is available, these upward exceptions appear to be due to polyploidy and/or varying transposon loads (of uncertain biological relevance), rather than an absolute increase in genetic complexity"

So he just admitted that HALF THE HUMAN GENOME, HALF OF IT!! does not increase genetic complexity! Yes, you can say that HALF of the human genome does not add to genetic complexity, but you are not allowed to say it does not add to function!

He does not describe his metric for genetic complexity. If he means Kolmogorov complexity, he's right-- repetitive sequences add very little to Kolmogorov or algorithmic complexity-- but while that's true, it appears to contradict his assertion that the whole genome is functional, and all of it adds to "developmental complexity."

But I want to discuss that, besides being self-contradictory, it's gobbledygook in terms of Mattick's own allegedly relevant measure of complexity, so-called "developmental complexity" which he does not define a metric for! But supposedly we humans are superior to all other organisms by a metric Mattick won't define! Like here:

Mattick: "Moreover, there is a broadly consistent rise in the amount of non-protein-coding intergenic and intronic DNA with developmental complexity"

Mattick doesn't define "developmental complexity" and he CANNOT without looking like a fool, and sucking himself into a whirlpool of self-contradiction! Consider the following contradictory facts:

All salamanders, newts, axolotls, caecilians and waterdogs have far larger genomes than human beings, SFAIK. That's the rule, not the exception. Are they more developmentally complex than humans?

Suppose Mattick were to argue that indeed, all salamanders are indeed more developmentally complex than humans. He'd then have several huge problems.

1. Axolotls don't fully develop into the adult form of salamanders-- they keep their gills-- they are certainly less "developmentally complex" than humans and other salamanders, but the axolotl Ambystoma mexicanum has 13.7 times as much DNA as a human! SFAIK, this is true of all axolotls studied!

2. Caecilians are legless like snakes, but are amphibians. They never grow legs, and are certainly less "developmentally complex" than humans, but the legless caecilian Siphonops annulatus has 4 times as much DNA as a human.

3. The two-toed amphiuma, Amphiuma means, aka "Conger eel", like an axolotl, never undergoes full development, has small vestigial legs, no eyelids, and no tongue. But it has 27.4 times as much DNA as a human. According to Mattick's logic, it is more "developmentally complex" than a human.

Mattick cannot dismiss these facts as "flukes" or exceptions to his imaginary "rule" that he faked in his Dog's Ass Plot. Again I repeat: All salamanders, newts, axolotls, caecilians and waterdogs have far larger genomes than human beings, SFAIK.

To be continued, with more rules.

Diogenes said...

Continuing, with more rules that are inconvenient for Mattick:

4. Another rule: All lungfishes have more DNA than humans. The African marbled lungfish Protopterus aethiopicus has 38 times more DNA than a human. Are all lungfish more "developmentally complex" than humans?

5. Another rule: Marsupials on AVERAGE have 22% more DNA than the average placental mammal. The AVERAGE for marsupials is 16% higher than human genome size. Bennett's wallaby, for one, has 60% more DNA than a human. That's not a fluke, that's the rule. Are marsupials more "developmentally complex" than humans?

Those are the rules. Those are not the exceptions.

There are also many frogs, many sharks, many crustaceans, some insects, some annelid worms, many flatworms and many plants with more DNA than humans.

Diogenes said...

Let's do some simple back-of-the-envelope calculations of percent non-coding DNA, and see if Mattick is right about "no downward exceptions."

It's easy to look up genome sizes, but I can't find many counts of coding base pairs for all species.

Here are three where I at least know the gene counts.

The sea urchin has 814 Mbp and 23.3K genes.

The pufferfish has 330 Mbp and 27K genes.

The human has 3200 Mbp and 23K genes.

So we guess that each gene has, say, 1500 bp of coding sequence. In the ballpark.

Then we compute the percent of non-coding DNA:

((genome size in bp) - (# of genes) * 1500) / (genome size in bp)

Sea Urchin: 95.7%

Puffer fish: 87.7%

Human: 98.9%
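
The estimate above can be reproduced with a short Python sketch. The genome sizes and gene counts are the ones quoted in this comment; the 1500 bp of coding sequence per gene is the same ballpark guess, not a measured value, and the function name is just for illustration.

```python
# Genome size (bp) and gene count, as quoted in the comment above.
GENOMES = {
    "Sea urchin": (814e6, 23_300),
    "Pufferfish": (330e6, 27_000),
    "Human": (3200e6, 23_000),
}

# Ballpark assumption from the comment: ~1500 coding bp per gene.
ASSUMED_CODING_BP_PER_GENE = 1500


def percent_noncoding(genome_bp, n_genes, coding_bp_per_gene=ASSUMED_CODING_BP_PER_GENE):
    """Percent of the genome left over after subtracting estimated coding DNA."""
    coding_bp = n_genes * coding_bp_per_gene
    return 100.0 * (genome_bp - coding_bp) / genome_bp


for name, (genome_bp, n_genes) in GENOMES.items():
    print(f"{name}: {percent_noncoding(genome_bp, n_genes):.1f}%")
```

Changing the per-gene guess shifts all three percentages, but the pufferfish stays lowest because it has both the smallest genome and the highest gene count of the three.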

Is a sea urchin more developmentally complex than a pufferfish?

Mattick says there are no "downward exceptions" to his rules. Is that true?

Within many genera there is a vast variation in genome size, but no real variation in developmental complexity.

Within onions there is a 9.8-fold variation. The difference between the largest and the smallest onion genome is 19.1 times the size of the whole human genome.

Within Necturus, the "waterdogs" [amphibians], there is a 5-fold variation over the genus. The difference between the largest and the smallest is 27.5 times larger than the whole human genome.

Within Genlisea-Utricularia [bladderworts, a flowering plant] there is a 24-fold variation in genome size.

Within Hyla, a genus of tree frogs, there is a 4-fold variation.

Within Xenopus, another frog genus, there is a 2.7-fold variation.

Within the genus Ctenomys, the tuco-tuco [a rodent] there is a 1.75-fold variation.

Among amphibians, there is at least a 100-fold variation. Among angiosperms, there is a 2000-fold variation.

Certainly everything in a genus should have the same number of genes; thus the percentage of non-coding DNA must vary enormously over these genera.

Certainly everything in a genus should have the same developmental complexity. How can Mattick say that developmental complexity scales with percent of non-coding DNA, and that there are no "downward exceptions"?
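
To see how the non-coding percentage moves when genome size changes but the amount of coding DNA does not, here is a minimal Python sketch. The fold values are the ones quoted above; the 90% baseline for the smallest genome in each genus is a purely hypothetical placeholder, since the thread gives no per-species coding counts.

```python
def noncoding_fraction_after_expansion(base_noncoding_fraction, fold_increase):
    """Non-coding fraction of a genome `fold_increase` times larger than a
    baseline genome, assuming the amount of coding DNA stays the same."""
    coding_fraction = 1.0 - base_noncoding_fraction
    return 1.0 - coding_fraction / fold_increase


# Hypothetical baseline: smallest genome in the genus is 90% non-coding.
HYPOTHETICAL_BASELINE = 0.90

# Within-genus fold variation in genome size, as quoted in the comment.
FOLD_VARIATION = [("onions", 9.8), ("Necturus", 5.0), ("Hyla", 4.0), ("Ctenomys", 1.75)]

for genus, fold in FOLD_VARIATION:
    frac = noncoding_fraction_after_expansion(HYPOTHETICAL_BASELINE, fold)
    print(f"{genus}: {100 * frac:.1f}% non-coding at {fold}-fold the smallest genome")
```

The relation used is simply: new non-coding fraction = 1 - (coding fraction of the smallest genome) / (fold increase).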

John Harshman said...

Whether there are downward exceptions depends entirely on what you use for your standard. Has Mattick established a standard? If you pick as your standard the species with the smallest genome in each group, then there will be no downward exceptions. Of course that will increase upward exceptions. I'm wondering what the standard vertebrate might be. If it's fugu, perhaps there are no downward exceptions. But it appears to be, judging by the figure, a doggie. So that's a problem for him. I forget why humans aren't vertebrates and why vertebrates aren't chordates. But never mind.

Claudiu Bandea said...

I think that most scientists are rarely ignorant, but they associate and promote certain ideas or concepts (even when they know that they are misleading), because of potential rewards, such as advancing their careers. The case of ENCODE project is a good example.

Anonymous said...

So, this means Larry M. wins the "junk" debate for now?

SPARC said...

IMO Mattick's use of the term "polyploidy" is wrong. I would rather use "genome duplication" instead. While polyploidy is the starting point for genome duplication it implies that chromosome numbers remain constant after duplication events. Genome duplication, IMO, leaves room for later rearrangements (translocations, inversions, deletions, additional gene duplications) that have happened to form the chromosomes of living species many of which are diploid despite some polyploid state of a common ancestor.

SPARC said...

John Harshman

The IDiots will probably start citing it.

Every IDiot at the DI is too occupied with Hedin/BSU issues at the moment. But wait another few days ...

The whole truth said...

What do you guys and gals think of this?

SPARC said...

One may wonder if Mattick and Dinger are aware of how genome size is evaluated. How can they talk about polyploidy when C-values refer to the haploid genome? I.e., it corrects for ploidy. From their abstract:

We also show that polyploidy accounts for the higher than expected genome sizes in some eukaryotes, compounded by variable levels of repetitive sequences of unknown significance.

Otherwise one would have to conclude that they knowingly misrepresent the C-value paradox.

SPARC said...

I don't have access to the full Wong et al. article at the moment, but my impression is that it rather states that mRNAs containing retained introns are directed towards NMD. Thus, it is likely that at least many of such RNAs are non-functional. The hype about the impact of splice variants and the size of databases of alternatively spliced transcripts was dubious before, though.

Anonymous said...

Y'all's fits validate what the peer-reviewed work by these researchers asserts: that you really couldn't care less about scientific research and care more about protecting a preferred ideology. Do some current peer-reviewed work to back your assertions, and maybe the rest of us will take you seriously...

The whole truth said...

Misc, do you have a preferred ideology? If so, what is it?

The whole truth said...

And who are "the rest of us"?

Diogenes said...

Harshman asks: "Whether there are downward exceptions depends entirely on what you use for your standard. Has Mattick established a standard?"

In the Dog's Ass Plot, the y axis is percent of non-coding DNA. But within single genera, there are huge variations in genome sizes, so considering that different species within the same genus must have comparable numbers of genes, and identical "complexity", there must be huge variations in percentages of non-coding DNA, which do not track with "developmental complexity."

Diogenes said...

Misc says: "Do some current peer review work to back your assertions"

The Mattick article contains no original research. That which is asserted without doing original research can be refuted without doing original research.

Mattick's article is the equivalent of a letter to the editor-- it's his opinion, no new research there-- but we know his logic is self-contradictory, when it isn't circular, and we know he's factually wrong.

When he says there are "no downward exceptions", that's factually wrong. It's not backed up by his original research, and we don't need to do original research to refute it. Just read or download T. Ryan Gregory's database of animal genome sizes.

John Harshman said...

That doesn't seem to be responding to the question. But never mind, it was rhetorical.

Anonymous said...

It's funny how this blog repeatedly references a 10-year-old figure rendered by the illustrators of Scientific American for the greater public audience. How about reading an entire peer-reviewed scientific journal publication, like this one by Liu, Mattick and Taft that provides substantial evidence that ncDNA, and the ncRNAs encoded within it, may be intimately involved in the evolution, maintenance and development of complex life.

un said...

The following quote is taken from the abstract of the paper that you cited:

"We have previously argued that the proportion of an animal genome that is non-protein-coding DNA (ncDNA) correlates well with its apparent biological complexity."

This claim has been shown to be factually incorrect, in this thread and in many others. And the graph that you're trying to distance Mattick from is actually an accurate reflection of the meaning of the statement quoted above.

In any case, you haven't addressed any of the arguments that Larry and others have raised against Mattick's central claims. In fact, if Mattick is an honest scientist, and I believe he is, he should admit that the many "downward exceptions" that have been provided in this thread do actually falsify his claims.

Anonymous said...

There are always exceptions in biology; such is the way of nature. In a much more refined and detailed approach than the approximations of this thread, the authors also report: we extended our prior work to the 1,627 prokaryotic and 153 eukaryotic genomes described above and found a clear correlation between the nc/tg ratio and increasing complex taxonomic groups (p < 2.2e-16, Kruskal-Wallis test, Fig. 2A). The range of nc/tg values is considerable, with the averages for archaea and bacteria being nearly identical (two-tailed p = 0.359, Mann-Whitney U test) at 0.130 and 0.136, respectively, and extending to ~0.98 in the Metazoa. The average value for each taxa is minimally influenced by data points outside the first or third quartiles. For example, [...]
And they further state: To further refine the association of nc/tg ratio values and organismal complexity, we investigated the 73 species with a previously defined number of cell types [35]. Examining these species revealed a positive correlation between the nc/tg ratio and organismal complexity (Fig. 2B, Spearman correlation coefficient r = 0.952, p value < 0.0001). We found that the distribution of values was well described by a modified Hill’s equation [56] (which is itself a modified logistic function, see “Discussion”), in the form y = Kx^n/(1 + Kx^n) where K = 0.15219 ± 0.02272 with a p value < 0.0001 and n = 0.99888 ± 0.06943 with a p value < 0.0001 (Fig. 2B). This distribution is consistent with patterns observed in complex information systems theory, in which the amount of encoded information approaches an asymptote defined by the maximum allowable entropy (see “Discussion”).

I suspect that by "downward exceptions", what is implied is a downward trend in the nc/tg ratios during evolution.
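
For readers who want to see the shape of the fitted curve quoted above, here is a minimal Python sketch of the modified Hill equation with the reported K and n. The quote does not say which variable is which, so treating x as the number of cell types and y as the nc/tg ratio is an assumption; the error bars on K and n are ignored.

```python
# Fitted parameters as quoted from the paper (central values only).
K = 0.15219
N = 0.99888


def hill(x, k=K, n=N):
    """Modified Hill equation y = K*x^n / (1 + K*x^n).

    Assumption: x is the number of cell types and y is the nc/tg ratio.
    """
    kxn = k * x ** n
    return kxn / (1.0 + kxn)


# Cell type counts spanning the range reported for the 73 species.
for cell_types in (1, 5, 25, 64, 169):
    print(f"x = {cell_types:>3}: y = {hill(cell_types):.3f}")
```

With n so close to 1, the curve is essentially y = Kx/(1 + Kx): it rises steeply at low x and flattens toward 1 at high x.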

Diogenes said...

If that's what Mattick meant by "downward exceptions" then his claims have been falsified.

This article is behind a pay wall. Noncodarnia, please copy and paste the values of nc and tg and celltype counts for the 73 species named. It might be in the Supplemental materials. Let's make our own plot and see if it matches the dog's ass plot.

We should also note that correlation does not prove causation. More complex organisms have smaller population sizes, hence more slightly deleterious mutations, hence more non-coding DNA. So this correlation does not get close to proving all ncDNA adds to biological complexity.

But we know the rules: all salamanders, newts, caecilians, axolotls etc. have much more ncDNA than humans. All lungfish have more ncDNA than humans. Marsupials on average have more DNA than placentals incl. humans. Sharks have more ncDNA than bony fish. So let's make our own plot and see if we can discern the real rules.

Thanks in advance for your assistance. You copy and paste nc, tg values and celltype counts, I'll handle statistical analysis.

Diogenes said...

Noncodarnia says there are always exceptions in biology. Mattick just said there are none. Noncodarnia should therefore regard Mattick as incompetent. I doubt such blatant contradictions will trouble him.

And another question: why in this paper did Mattick only analyze ANIMALS? No plants, no fungi allowed. They've been EXPELLED.

Inconvenient facts: angiosperm genomes vary by 2000-fold.

I have two questions for you:

Is Paris japonica a downward exception?

Is Utricularia gibba?

I think you'll change the subject rather than answer. Mattick is cherry picking his data, therefore he did not pick cherries; they would be a most unsuitable data point.

un said...

Adding to what Diogenes has just said, the following is quoted from Graur's critique of the ENCODE media hype:

"Actually, evolution can only produce a genome devoid of “junk” if and only if the effective population size is huge and the deleterious effects of increasing genome size are considerable (Lynch 2007). In the vast majority of known bacterial species, these two conditions are met; selection against excess genome is extremely efficient due to enormous effective population sizes, and the fact that replication time and, hence, generation time are correlated with genome size. In humans, there seems to be no selection against excess genomic baggage. Our effective population size is pitiful and DNA replication does not correlate with genome size."

I'm afraid that the inclusion of prokaryotic genomes in the first analysis that you quoted might have skewed the results.

The whole truth said...

You guys and gals may find it interesting that the Monarch Butterfly (Danaus plexippus) has a genome size of 0.29pg (picograms) while the Least-Marked Euchlaena Moth (Euchlaena irraria) has a genome size of 1.94pg.

From this page:

Of course 'complexity' can be argued endlessly and a Monarch butterfly isn't exactly the same critter as a Euchlaena Moth. They each have their own attributes. The Monarch stands out with its migration and wintering behavior, while the moth may be able to hear the sonar signals of bats and avoid being eaten. I say "may" because I'm not familiar with that particular moth. Still, I'm a bit surprised that the Monarch genome is smaller than that of a Euchlaena Moth and all the other assayed leps (59 total), according to that page.
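
To put those picogram values on the same scale as the Mbp figures used elsewhere in this thread, here is a small Python sketch. The conversion factor of 978 Mbp per picogram is the commonly used one and is not stated in the thread; the two C-values are the ones quoted above.

```python
# Commonly used conversion factor; an outside value, not from this thread.
MBP_PER_PG = 978.0


def pg_to_mbp(picograms):
    """Convert a haploid genome size (C-value) from picograms to megabase pairs."""
    return picograms * MBP_PER_PG


# C-values in pg, as quoted in the comment above.
C_VALUES_PG = {
    "Danaus plexippus (Monarch)": 0.29,
    "Euchlaena irraria": 1.94,
}

for species, pg in C_VALUES_PG.items():
    print(f"{species}: {pg} pg is about {pg_to_mbp(pg):.0f} Mbp")

fold = C_VALUES_PG["Euchlaena irraria"] / C_VALUES_PG["Danaus plexippus (Monarch)"]
print(f"Fold difference: {fold:.1f}")
```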

Diogenes said...

Interesting. Above I asked whether Mattick had cherry picked his data by not picking cherries. For the record, the common cherry Prunus avium has 1/10th the DNA of a human, but within the genus Prunus there is a 13-fold variation.

The largest species is bigger than the human genome, and the difference between largest and smallest is 3.37 pg, almost as big as the whole human genome.

Response from panfunctionists: crickets.

Diogenes said...

Shadi, you are right about his large number of prokaryotes, but I would like to know the 73 species used by Mattick to compute the correlation with cell type counts-- perhaps the subset of 73 is less dominated by prokaryotes.

I asked Noncodarnia to paste Mattick's data and I'm still waiting.

But if there's a bias toward prokaryotes, it's part of a larger problem of bias: Mattick limits himself to sequenced genomes, and so that database is highly biased against large genomes, because they're expensive to sequence.

So with this standard, he's generally examining the near smallest genomes in each group, humans excepted. This bias might be why humans stand out-- an artifact of our human interest in sequencing our own bloated genome, but not the Neuse River waterdog's.

Diogenes said...

And on the topic of crickets (since crickets are all we hear when we ask the creationists or Mattick and coworkers to explain the C value paradox), the families Gryllacrididae and Gryllidae have genome sizes varying by 6-fold among them. The camel cricket has 2.7 times more DNA than human. The difference between the largest and the smallest is more than twice the size of the whole human genome.

But you saw that coming.

SPARC said...

Here's the species list from the supplements (I have to split it due to maximum comment length issues):

I omit 111 Archaea and 1516 Bacteria species.

Hansenula polymorpha NCYC 495 leu1.1 Dikarya:Ascomycota
Encephalitozoon cuniculi Microsporidia
Saccharomyces cerevisiae Dikarya:Ascomycota
Debaryomyces hansenii Dikarya:Ascomycota
Kluyveromyces lactis Dikarya:Ascomycota
Wallemia sebi Dikarya:Basidiomycota
Candida glabrata Dikarya:Ascomycota
Ustilago maydis Dikarya:Basidiomycota
Schizosaccharomyces pombe Dikarya:Ascomycota
Cryptococcus neoformans Dikarya:Basidiomycota
Pichia stipitis Dikarya:Ascomycota
Aspergillus niger Dikarya:Ascomycota
Aspergillus terreus Dikarya:Ascomycota
Rhodotorula graminis strain WP1 Dikarya:Basidiomycota
Stagonospora nodorum Dikarya:Ascomycota
Dothistroma septosporum NZE10 Dikarya:Ascomycota
Aspergillus nidulans Dikarya:Ascomycota
Aspergillus fumigatus A1163 Dikarya:Ascomycota
Coprinopsis cinerea Dikarya:Basidiomycota
Aspergillus fumigatus Af293 Dikarya:Ascomycota
Batrachochytrium dendrobatidis JAM81 Chytridiomycota
Aspergillus clavatus Dikarya:Ascomycota
Fusarium graminearum Dikarya:Ascomycota
Chaetomium globosum Dikarya:Ascomycota
Septoria musiva SO2202 Dikarya:Ascomycota
Yarrowia lipolytica Dikarya:Ascomycota
Neosartorya fischeri Dikarya:Ascomycota
Agaricus bisporus var. burnettii JB137-S8 Dikarya:Basidiomycota
Heterobasidion annosum Dikarya:Basidiomycota
Schizophyllum commune Dikarya:Basidiomycota
Trichoderma virens Gv29-8 Dikarya:Ascomycota
Pleurotus ostreatus PC15 Dikarya:Basidiomycota
Trichoderma atroviride Dikarya:Ascomycota
Sporobolomyces roseus Dikarya:Basidiomycota
Aspergillus carbonarius ITEM 5010 Dikarya:Ascomycota
Agaricus bisporus var bisporus (H97) Dikarya:Basidiomycota
Aspergillus oryzae Dikarya:Ascomycota
Aspergillus flavus Dikarya:Ascomycota
Tremella mesenterica Fries Dikarya:Basidiomycota
Gloeophyllum trabeum Dikarya:Basidiomycota
Ceriporiopsis subvermispora B Dikarya:Basidiomycota
Trichoderma reesei Dikarya:Ascomycota
Cochliobolus heterostrophus C5 Dikarya:Ascomycota
Phanerochaete chrysosporium Dikarya:Basidiomycota
Magnaporthe grisea Dikarya:Ascomycota
Neurospora discreta FGSC 8579 mat A Dikarya:Ascomycota
Neurospora tetrasperma FGSC 2508 mat A Dikarya:Ascomycota
Thielavia terrestris Dikarya:Ascomycota
Neurospora crassa Dikarya:Ascomycota
Mucor circinelloides CBS277.49 Fungi incertae sedis
Mycosphaerella graminicola Dikarya:Ascomycota
Laccaria bicolor Dikarya:Basidiomycota
Cryphonectria parasitica EP155 Dikarya:Ascomycota
Sporotrichum thermophile Dikarya:Ascomycota
Phycomyces blakesleeanus NRRL1555 Fungi incertae sedis
Serpula lacrymans S7.3 Dikarya:Basidiomycota
Puccinia graminis f. sp. tritici Dikarya:Basidiomycota
Postia placenta MAD-698 Dikarya:Basidiomycota
Mycosphaerella fijiensis Dikarya:Ascomycota
Melampsora laricis-populina Dikarya:Basidiomycota

SPARC said...

Paramecium tetraurelia Alveolata
Theileria annulata Alveolata
Cryptosporidium parvum Iowa type II Alveolata
Dictyostelium discoideum Stramenopiles
Plasmodium yoelii Alveolata
Naegleria gruberi Heterolobosea
Plasmodium chabaudi Alveolata
Plasmodium berghei Alveolata
Entamoeba histolytica Amoebozoa
Dictyostelium purpureum QSDP1 Amoebozoa
Thalassiosira pseudonana Stramenopiles
Plasmodium falciparum Alveolata
Phaeodactylum tricornutum Alveolata
Trypanosoma brucei Euglenozoa
Plasmodium vivax Alveolata
Leishmania major Euglenozoa
Tetrahymena thermophila Amoebozoa
Plasmodium knowlesi Alveolata
Monosiga brevicollis Choanoflagellida
Phytophthora ramorum Heterokontophyta
Phytophthora capsici LT1534 Stramenopiles
Trypanosoma cruzi Euglenozoa
Emiliania huxleyi CCMP1516 Haptophyceae

Ostreococcus tauri Chlorophyta
Ostreococcus lucimarinus Chlorophyta
Ostreococcus sp. RCC809 Chlorophyta
Micromonas pusilla CCMP1545 Chlorophyta
Chlorella sp. NC64A Chlorophyta
Coccomyxa sp. C-169 Chlorophyta
Arabidopsis thaliana Streptophyta
Chlamydomonas reinhardtii Chlorophyta
Cucumis sativus Streptophyta
Selaginella moellendorffii Streptophyta
Volvox carteri f. nagariensis Chlorophyta
Arabidopsis lyrata Streptophyta
Oryza sativa Streptophyta
Brachypodium distachyon Streptophyta
Populus trichocarpa Streptophyta
Mimulus guttatus v1.0 Streptophyta
Physcomitrella patens subsp patens Streptophyta
Vitis vinifera Streptophyta
Sorghum bicolor Streptophyta
Zea mays Streptophyta

Caenorhabditis remanei Nematoda
Caenorhabditis elegans Nematoda
Caenorhabditis briggsae Nematoda
Tetranychus urticae Arthropoda
Amphimedon queenslandica Porifera
Caenorhabditis brenneri Nematoda
Caenorhabditis japonica Nematoda
Daphnia pulex Arthropoda
Pristionchus pacificus Nematoda
Trichoplax adhaerens Grell-BS-1999 Placozoa
Drosophila erecta Arthropoda
Drosophila melanogaster Arthropoda
Helobdella robusta Annelida
Drosophila grimshawi Arthropoda
Capitella teleta Annelida
Tetraodon nigroviridis Vertebrata
Ciona intestinalis Chordata
Drosophila ananassae Arthropoda
Takifugu rubripes Vertebrata
Branchiostoma floridae Chordata
Ciona savignyi Chordata
Anopheles gambiae Arthropoda
Nematostella vectensis Cnidaria
Lottia gigantea Mollusca
Acyrthosiphon pisum Arthropoda
Gasterosteus aculeatus Vertebrata
Aedes aegypti Arthropoda
Oryzias latipes Vertebrata
Culex quinquefasciatus Arthropoda
Bombyx mori Arthropoda
Danio rerio Vertebrata
Gallus gallus Vertebrata
Xenopus tropicalis Vertebrata
Taeniopygia guttata Vertebrata
Anolis carolinensis Vertebrata
Felis catus Mammalia
Ailuropoda melanoleuca Mammalia
Mus musculus Mammalia
Rattus norvegicus Mammalia
Canis familiaris Mammalia
Homo sapiens Mammalia
Ornithorhynchus anatinus Mammalia
Bos taurus Mammalia
Cavia porcellus Mammalia
Gorilla gorilla Mammalia
Pan troglodytes Mammalia
Sus scrofa Mammalia
Pongo abelii Mammalia
Monodelphis domestica Mammalia
Oryctolagus cuniculus Mammalia

SPARC said...

More interestingly, Mattick defines biological complexity on the basis of the number of cell types:

Species Phylogenetic Group complexity (mean number of different cell types)
Mycoplasma genitalium G37 bacteria 1
Mycoplasma pneumoniae M129 bacteria 1
Bartonella bacilliformis KC583 bacteria 1
Helicobacter pylori bacteria 1
Streptococcus pyogenes M2, MGAS10270 bacteria 1
Neisseria meningitidis FAM18 bacteria 1
Actinobacillus succinogenes 130Z bacteria 1
Clostridium tetani Massachusetts E88 bacteria 1
Staphylococcus epidermidis ATCC 12228 bacteria 1
Lactobacillus plantarum JDM1 bacteria 1
Brucella abortus S19 bacteria 1
Clostridium botulinum type A - Hall bacteria 1
Mycobacterium tuberculosis H37Rv (lab strain) bacteria 1
Yersinia enterocolitica enterocolitica 8081 bacteria 1
Escherichia coli DH10B bacteria 1
Salmonella enterica enterica sv Typhimurium LT2 LT2 bacteria 1
Paracoccus denitrificans PD1222 bacteria 1
Bacillus anthracis CDC 684 bacteria 1
Pseudomonas aeruginosa PAO1 bacteria 1
Ensifer medicae WSM419 bacteria 1
Cupriavidus necator JMP134 bacteria 1
Myxococcus xanthus DK 1622 bacteria 1
Streptomyces coelicolor A3(2) bacteria 1
Dictyostelium discoideum protozoa 4,65
Trypanosoma brucei protozoa 7,85
Leishmania major protozoa 7,85
Phytophthora ramorum protozoa 7,85
Entamoeba histolytica protozoa 4,65
Theileria annulata protozoa 7,85
Plasmodium falciparum protozoa 7,85
Neurospora crassa fungi 5,55
Aspergillus nidulans fungi 5,55
Schizosaccharomyces pombe fungi 4,35
Saccharomyces cerevisiae fungi 3,05
Kluyveromyces lactis fungi 3,05
Yarrowia lipolytica fungi 3,05
Encephalitozoon cuniculi fungi 3,35
Phanerochaete chrysosporium fungi 4,35
Ustilago maydis fungi 4,35
Chlamydomonas reinhardtii plant 12,5
Micromonas pusilla CCMP1545 plant 12,5
Ostreococcus lucimarinus plant 12,5
Ostreococcus tauri plant 12,5
Volvox carteri f. nagariensis plant 14,5
Physcomitrella patens subsp patens plant 22
Selaginella moellendorffii plant 25
Brachypodium distachyon plant 27,25
Oryza sativa plant 27,25
Sorghum bicolor plant 27,25
Arabidopsis thaliana plant 27,25
Vitis vinifera plant 27,25
Populus trichocarpa plant 28,5
Amphimedon queenslandica protostomia 16
Nematostella vectensis protostomia 22
Caenorhabditis elegans protostomia 28,5
Daphnia pulex protostomia 50
Anopheles gambiae protostomia 64
Drosophila melanogaster protostomia 64
Ciona intestinalis deuterostomia 74
Branchiostoma floridae deuterostomia 100
Danio rerio deuterostomia 119,5
Tetraodon nigroviridis deuterostomia 119,5
Takifugu rubripes deuterostomia 119,5
Xenopus tropicalis deuterostomia 129,5
Anolis carolinensis deuterostomia 140
Gallus gallus deuterostomia 154
Felis catus deuterostomia 159
Canis familiaris deuterostomia 159
Bos taurus deuterostomia 159
Rattus norvegicus deuterostomia 159
Mus musculus deuterostomia 159
Pan troglodytes deuterostomia 169
Homo sapiens deuterostomia 169
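
Anyone wanting to re-plot these numbers has to deal with the decimal commas and the multi-word species names. Here is a minimal Python sketch for parsing rows in the layout above; it assumes the last token is the value and the second-to-last token is the group (true for every row in this list), and the function name is hypothetical.

```python
def parse_row(line):
    """Split 'Species name ... group value' into (species, group, float value).

    Assumes the group is a single word and the value uses a decimal comma.
    """
    species, group, value = line.rsplit(None, 2)
    return species, group, float(value.replace(",", "."))


# A few rows copied from the list above.
ROWS = """\
Mycobacterium tuberculosis H37Rv (lab strain) bacteria 1
Saccharomyces cerevisiae fungi 3,05
Arabidopsis thaliana plant 27,25
Drosophila melanogaster protostomia 64
Homo sapiens deuterostomia 169
"""

table = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
for species, group, cell_types in table:
    print(f"{species:<50} {group:<14} {cell_types}")
```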

SPARC said...

Mattick cites G. Wray, who wants to solve the C-value paradox by explaining the G-value paradox with the I-value.

Diogenes said...

SPARC, you're the hero of the day. Thank you very much for stepping up to the plate when Noncodarnia (who I think might be a student of Mattick's) ran off.

The cell type counts are great, but might I trouble you to copy the coding nucleotide counts and genome sizes for the 73 species that Mattick analyzes? Not all of them, just the 73 for which he computed the correlation.

SPARC said...

just send your e-mail address to

Diogenes said...

Done and done.

SPARC said...

I am wondering if Mattick's cell type counts are correct. I am not aware that Hominidae differ in chromosome numbers from other mammals but I may be wrong.

SPARC said...

My comment on Mattick and Dinger at HUGO didn't show up in a week.