Tuesday, September 05, 2023

John Mattick's new dog-ass plot (with no dog)

John Mattick is famous for arguing that there's a correlation between genome size and complexity, notably in a 2004 Scientific American article (Mattick, 2004) [Genome Size, Complexity, and the C-Value Paradox]. That's the article that has the famous Dog-Ass Plot (left) with humans representing the epitome of complexity and genome size. He claims that this correlation is evidence that most of the genomes of complex animals must have a function. He repeats this claim in a recent paper (see below).

Mattick, J.S. (2023) RNA out of the mist. Trends in Genetics 39:187-207. [doi: 10.1016/j.tig.2022.11.001]

RNA has long been regarded primarily as the intermediate between genes and proteins. It was a surprise then to discover that eukaryotic genes are mosaics of mRNA sequences interrupted by large tracts of transcribed but untranslated sequences, and that multicellular organisms also express many long ‘intergenic’ and antisense noncoding RNAs (lncRNAs). The identification of small RNAs that regulate mRNA translation and half-life did not disturb the prevailing view that animals and plant genomes are full of evolutionary debris and that their development is mainly supervised by transcription factors. Gathering evidence to the contrary involved addressing the low conservation, expression, and genetic visibility of lncRNAs, demonstrating their cell-specific roles in cell and developmental biology, and their association with chromatin-modifying complexes and phase-separated domains. The emerging picture is that most lncRNAs are the products of genetic loci termed ‘enhancers’, which marshal generic effector proteins to their sites of action to control cell fate decisions during development.

Most of the paper is simply a rehash of the same argument he has been making for 25 years [John Mattick's latest attack on junk DNA] [John Mattick's paradigm shaft]. He claims that most biologists are stuck in an old paradigm where all genes are protein-coding genes. He claims that the recent discovery of noncoding genes has overthrown the old paradigm and led to the realization that most genes in complex organisms are noncoding genes that produce regulatory RNAs.

Recently, he has become aware of several arguments in favor of junk DNA and several arguments that question the functionality of most transcripts. This paper is an attempt to neutralize those arguments, but we've heard his excuses before and hardly anyone is buying them. The latest human reference genome annotation lists 19,831 protein-coding genes and 25,959 noncoding genes, but very few of those noncoding genes have been shown to be functional. The Caenorhabditis elegans (nematode) reference genome has 24,813 noncoding genes according to Ensembl annotators. Mattick probably doesn't think that a nematode is almost as complex as a human.
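For readers who want to check the arithmetic behind that comparison, here is a minimal Python sketch. It uses only the three gene counts quoted above. The variable names are my own, chosen for illustration, and the calculation says nothing about whether any of these annotated genes are functional.

```python
# Gene counts as quoted in the post (annotation numbers, not evidence of function).
HUMAN_PROTEIN_CODING = 19_831
HUMAN_NONCODING = 25_959
NEMATODE_NONCODING = 24_813  # C. elegans, per the Ensembl annotation cited above

# How many annotated noncoding genes does the nematode have relative to humans?
nematode_vs_human_noncoding = NEMATODE_NONCODING / HUMAN_NONCODING

# What share of annotated human genes are noncoding?
human_noncoding_share = HUMAN_NONCODING / (HUMAN_PROTEIN_CODING + HUMAN_NONCODING)

print(f"Nematode/human noncoding gene ratio: {nematode_vs_human_noncoding:.2f}")
print(f"Noncoding share of annotated human genes: {human_noncoding_share:.2f}")
```

By this count the nematode has roughly 96% as many annotated noncoding genes as a human, which is the point of the comparison: if noncoding gene number tracked complexity, the two species would be nearly equivalent.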

Mattick remains convinced that there's a strong correlation between the size of a genome and the complexity of an organism. He believes that the correlation is due to a huge expansion in the number of noncoding regulatory genes. Apparently he thinks that the complexity of humans is due to the fact that the expression of ten thousand conserved housekeeping genes is regulated by ten or twenty thousand regulatory RNAs. Here's his latest plot of complexity (whatever that is) versus the fraction of the genome devoted to noncoding DNA. That fraction includes introns (~40% of the human genome) so I'm guessing that the size of introns has something to do with increasing complexity.

The good news is that dogs are no longer the second-most complex species. That honor now goes to mice.

Ryan Gregory showed us a much better way to display this data back in 2005 (Gregory, 2005). He gave me permission to publish my own version below (Moran, 2023).1

1. I'm aware of the fact that the vertical axis incorrectly implies some sort of progression. I decided to show it this way instead of scrambling the bars because that would be too confusing for the average reader.

Gregory, T.R. (2005) Genome Size Evolution in Animals. In The Evolution of the Genome. T.R. Gregory (ed.). Elsevier Academic Press, New York, Oxford etc.: 3-87.

Mattick, J.S. (2004) The hidden genetic program of complex organisms. Sci. Am. 291:60-67.


Anonymous said...

The red vizcacha rat has the largest mammalian genome, which is several times bigger than a human's. I find it "strange" (not really) that Mattick doesn't include it in his dog-ass plot...or explain what makes it more complex than humans.

John Harshman said...

Very odd plot and very cherry-picked. The other vertebrates, in addition to the human and the mouse, are fugu (!) and another I can't read. The only two plants are two varieties of rice. And the sole "invertebrate" is a mosquito. What, one wonders, are the criteria for inclusion?

Donald Forsdyke said...

John Mattick wrote: "I advise my students to take notice of the unusual, the things that cannot be explained by, or do not fit comfortably within, the current conceptual framework." This echoes the "treasure your exceptions" of the geneticist William Bateson. In the paper John sets out the history of the discovery of introns that, at first, seemed exceptional, but were then found to be very general in eukaryotes.

I wrote to John: "Your compatriot, New Zealand born, Darryl Reanney and I were equally puzzled by introns and came up with a solution that differed from yours. I was set on my trail by a friendly information scientist who told me that all forms of information are accompanied by Hamming codes that interrupt the primary information. Darryl veered off-track into deep, but intriguing, philosophy, and died in 1994. By that time, I had gathered much evidence supporting our scheme, which seems to have withstood the test of time.

The tale is related in successive editions of my textbook, Evolutionary Bioinformatics (3rd edition 2016). I suspect future blending of hypotheses will reveal that we all had some part of the truth.

Anonymous said...

I thought Dr. Forsdyke’s books sounded interesting—since I am a computer science guy and the reference to Hamming codes sounded suspiciously like buzzword bingo—but $60 for a Kindle book is beyond the impulse-buy threshold of a layman. Can anyone give me a pointer to some publicly-available treatment (perhaps Dr. Forsdyke himself?)