Showing posts sorted by date for query mattick.

Saturday, October 13, 2018

The great junk DNA debate


I've been talking to philosophers lately about the true state of the junk DNA controversy. I imagine what it would be like to stage a great debate on the topic. It's easy to come up with names for the pro-junk side: Dan Graur, Ford Doolittle, Sean Eddy, Ryan Gregory, etc. It's hard to think of any experts who could defend the idea that most of our genome is functional. The only scientist I can think of who would accept such a challenge is John Mattick, but let's imagine that he could find three others to join him in the great debate.

I claim that the debate would be a rout for the pro-junk side. The data and the theories are all on the side of those who would argue that 90% of our genome is junk. I don't think the functionalists could possibly defend the idea that most of our genome is functional. What do you think?

Assuming that I'm right, why is it that the average scientist doesn't know this? Why do they still believe there's a good case for function when none of the arguments stand up to close scrutiny? And why are philosophers not conveying the true state of the controversy to their readers? I'm told that anti-junk philosophers like Evelyn Fox Keller are held in high regard even though her arguments are easy to refute [When philosophers talk about genomes]. I'm told that John Mattick is highly respected in philosophy circles even though knowledgeable scientists have little use for his writings.

Can readers help me identify papers by philosophers of science that come down on the side of junk DNA and conclude that experts like Graur, Doolittle, etc are almost certainly correct?


Image Credit: The cartoon is by Tom Gauld and it was published online at The New York Times Magazine website. I hope they will consider it fair use on an educational blog. See: Junk DNA comments in the New York Times Magazine.

Saturday, April 07, 2018

Required reading for the junk DNA debate

This is a list of scientific papers on junk DNA that you need to read (and understand) in order to participate in the junk DNA debate. It's not a comprehensive list because it's mostly papers that defend junk DNA and refute arguments for massive amounts of function. The only exception is the paper by Mattick and Dinger (2013).1 It's the only anti-junk paper that attempts to deal with the main evidence for junk DNA. If you know of any other papers that make a good case against junk DNA then I'd be happy to include them in the list.

If you come across a publication that argues against junk DNA, then you should immediately check the reference list. If you do not see some of these references in the list, then don't bother reading the paper because you know the author is not knowledgeable about the subject.

Brenner, S. (1998) Refuge of spandrels. Current Biology, 8:R669-R669. [PDF]

Brunet, T.D., and Doolittle, W.F. (2014) Getting “function” right. Proceedings of the National Academy of Sciences, 111:E3365-E3365. [doi: 10.1073/pnas.1409762111]

Casane, D., Fumey, J., et Laurenti, P. (2015) L’apophénie d’ENCODE ou Pangloss examine le génome humain. Med. Sci. (Paris) 31: 680-686. [doi: 10.1051/medsci/20153106023] [The apophenia of ENCODE or Pangloss looks at the human genome]

Cavalier-Smith, T. (1978) Nuclear volume control by nucleoskeletal DNA, selection for cell volume and cell growth rate, and the solution of the DNA C-value paradox. Journal of Cell Science, 34:247-278. [PDF]

Doolittle, W.F. (2013) Is junk DNA bunk? A critique of ENCODE. Proc. Natl. Acad. Sci. (USA) published online March 11, 2013. [PubMed] [doi: 10.1073/pnas.1221376110]

Doolittle, W.F., Brunet, T.D., Linquist, S., and Gregory, T.R. (2014) Distinguishing between “function” and “effect” in genome biology. Genome biology and evolution 6, 1234-1237. [doi: 10.1093/gbe/evu098]

Doolittle, W.F., and Brunet, T.D. (2017) On causal roles and selected effects: our genome is mostly junk. BMC biology, 15:116. [doi: 10.1186/s12915-017-0460-9]

Eddy, S.R. (2012) The C-value paradox, junk DNA and ENCODE. Current Biology, 22:R898. [doi: 10.1016/j.cub.2012.10.002]

Eddy, S.R. (2013) The ENCODE project: missteps overshadowing a success. Current Biology, 23:R259-R261. [doi: 10.1016/j.cub.2013.03.023]

Graur, D. (2017) Rubbish DNA: The functionless fraction of the human genome. In Evolution of the Human Genome I (pp. 19-60). Springer. [doi: 10.1007/978-4-431-56603-8_2 (book)] [PDF]

Graur, D. (2017) An upper limit on the functional fraction of the human genome. Genome Biology and Evolution, 9:1880-1885. [doi: 10.1093/gbe/evx121]

Graur, D., Zheng, Y., Price, N., Azevedo, R. B., Zufall, R. A., and Elhaik, E. (2013) On the immortality of television sets: "function" in the human genome according to the evolution-free gospel of ENCODE. Genome Biology and Evolution published online: February 20, 2013 [doi: 10.1093/gbe/evt028]

Graur, D., Zheng, Y., and Azevedo, R.B. (2015) An evolutionary classification of genomic function. Genome Biology and Evolution, 7:642-645. [doi: 10.1093/gbe/evv021]

Gregory, T. R. (2005) Synergy between sequence and size in large-scale genomics. Nature Reviews Genetics, 6:699-708. [doi: 10.1038/nrg1674]

Haerty, W., and Ponting, C.P. (2014) No Gene in the Genome Makes Sense Except in the Light of Evolution. Annual review of genomics and human genetics, 15:71-92. [doi: 10.1146/annurev-genom-090413-025621]

Hurst, L.D. (2013) Open questions: A logic (or lack thereof) of genome organization. BMC biology, 11:58. [doi: 10.1186/1741-7007-11-58]

Kellis, M., Wold, B., Snyder, M.P., Bernstein, B.E., Kundaje, A., Marinov, G.K., Ward, L.D., Birney, E., Crawford, G. E., and Dekker, J. (2014) Defining functional DNA elements in the human genome. Proc. Natl. Acad. Sci. (USA) 111:6131-6138. [doi: 10.1073/pnas.1318948111]

Mattick, J. S., and Dinger, M. E. (2013) The extent of functionality in the human genome. The HUGO Journal, 7:2. [doi: 10.1186/1877-6566-7-2]

Morange, M. (2014) Genome as a Multipurpose Structure Built by Evolution. Perspectives in biology and medicine, 57:162-171. [doi: 10.1353/pbm.2014.000]

Niu, D. K., and Jiang, L. (2012) Can ENCODE tell us how much junk DNA we carry in our genome? Biochemical and biophysical research communications 430:1340-1343. [doi: 10.1016/j.bbrc.2012.12.074]

Ohno, S. (1972) An argument for the genetic simplicity of man and other mammals. Journal of Human Evolution, 1:651-662. [doi: 10.1016/0047-2484(72)90011-5]

Ohno, S. (1972) So much "junk" in our genome. In H. H. Smith (Ed.), Evolution of genetic systems (Vol. 23, pp. 366-370): Brookhaven symposia in biology.

Palazzo, A.F., and Gregory, T.R. (2014) The Case for Junk DNA. PLoS Genetics, 10:e1004351. [doi: 10.1371/journal.pgen.1004351]

Rands, C. M., Meader, S., Ponting, C. P., and Lunter, G. (2014) 8.2% of the Human Genome Is Constrained: Variation in Rates of Turnover across Functional Element Classes in the Human Lineage. PLOS Genetics, 10:e1004525. [doi: 10.1371/journal.pgen.1004525]

Thomas Jr, C.A. (1971) The genetic organization of chromosomes. Annual review of genetics, 5:237-256. [doi: annurev.ge.05.120171.001321]


1. The paper by Kellis et al. (2014) is ambiguous. It's clear that most of the ENCODE authors are still opposed to junk DNA even though the paper is mostly a retraction of their original claim that 80% of the genome is functional.

Thursday, April 05, 2018

Subhash Lakhotia: The concept of 'junk DNA' becomes junk

Continuing my survey of recent papers on junk DNA, I stumbled upon a review by Subhash Lakhotia that has recently been accepted in The Proceedings of the Indian National Science Academy (Lakhotia, 2018). It illustrates the extent of the publicity campaign mounted by ENCODE and opponents of junk DNA. In the title of this post, I paraphrased a sentence from the abstract that summarizes the point of the paper; namely, that the 'recent' discovery of noncoding RNAs refutes the concept of junk DNA.

Lakhotia claims to have written a review of the history of junk DNA but, in fact, his review perpetuates a false history. He repeats a version of history made popular by John Mattick. It goes like this. Old-fashioned scientists were seduced by Crick's central dogma into thinking that the only important part of the genome was the part encoding proteins. They ignored genes for noncoding RNAs because they didn't fit into their 'dogma.' They assumed that most of the noncoding part of the genome was junk. However, recent new discoveries of huge numbers of noncoding RNAs reveal that those scientists were very stupid. We now know that the genome is chock full of noncoding RNA genes and the concept of junk DNA has been refuted.

Tuesday, March 13, 2018

Making Sense of Genes by Kostas Kampourakis

Kostas Kampourakis is a specialist in science education at the University of Geneva, Geneva (Switzerland). Most of his book is an argument against genetic determinism in the style of Richard Lewontin. You should read this book if you are interested in that argument. The best way to describe the main thesis is to quote from the last chapter.

Here is the take-home message of this book: Genes were initially conceived as immaterial factors with heuristic values for research, but along the way they acquired a parallel identity as DNA segments. The two identities never converged completely, and therefore the best we can do so far is to think of genes as DNA segments that encode functional products. There are neither 'genes for' characters nor 'genes for' diseases. Genes do nothing on their own, but are important resources for our self-regulated organism. If we insist in asking what genes do, we can accept that they are implicated in the development of characters and disease, and that they account for variation in characters in particular populations. Beyond that, we should remember that genes are part of an interactive genome that we have just begun to understand, the study of which has various limitations. Genes are not our essences, they do not determine who we are, and they are not the explanation of who we are and what we do. Therefore we are not the prisoners of any genetic fate. This is what the present book has aimed to explain.

Sunday, February 18, 2018

Human genome books

I'm trying to read all the recent books on the human genome and anything related. There are a lot of them. Here's a list with some brief comments. You should buy some of these books. There are others you should not buy under any circumstances.

Friday, February 09, 2018

Are splice variants functional or noise?

This is a post about alternative splicing. I've avoided using that term in the title because it's very misleading. Alternative splicing produces a number of different products (RNA or protein) from a single intron-containing gene. The phenomenon has been known for 35 years and there are quite a few very well-studied examples, including several where all of the splice regulatory factors have been characterized.

Tuesday, February 06, 2018

How many lncRNAs are functional?

There's solid evidence that 90% of your genome is junk. Most of it is transcribed at some time but the transcripts are transient and usually confined to the nucleus. They are junk RNA [Functional RNAs?]. This is the view held by many experts but you wouldn't know that from reading the scientific literature and the popular press. The opposition to junk DNA gets much more attention in both venues.

There are prominent voices expressing the view that most of the genome is devoted to producing functional RNAs required for regulating gene expression [John Mattick still claims that most lncRNAs are functional]. Most of these RNAs are long noncoding RNAs known as lncRNAs. Although most of them fail all reasonable criteria for function there are still those who maintain that tens of thousands of them are functional [How many lncRNAs are functional: can sequence comparisons tell us the answer?].

Saturday, February 03, 2018

What's in Your Genome?: Chapter 5: Regulation and Control of Gene Expression

I'm working (slowly) on a book called What's in Your Genome?: 90% of your genome is junk! The first chapter is an introduction to genomes and DNA [What's in Your Genome? Chapter 1: Introducing Genomes ]. Chapter 2 is an overview of the human genome. It's a summary of known functional sequences and known junk DNA [What's in Your Genome? Chapter 2: The Big Picture]. Chapter 3 defines "genes" and describes protein-coding genes and alternative splicing [What's in Your Genome? Chapter 3: What Is a Gene?]. Chapter 4 is all about pervasive transcription and genes for functional noncoding RNAs [What's in Your Genome? Chapter 4: Pervasive Transcription].

Chapter 5 is Regulation and Control of Gene Expression.
Chapter 5: Regulation and Control of Gene Expression

What do we know about regulatory sequences?
The fundamental principles of regulation were worked out in the 1960s and 1970s by studying bacteria and bacteriophage. The initiation of transcription is controlled by activators and repressors that bind to DNA near the 5′ end of a gene. These transcription factors recognize relatively short sequences of DNA (6-10 bp) and their interactions have been well-characterized. Transcriptional regulation in eukaryotes is more complicated for two reasons. First, there are usually more transcription factors and more binding sites per gene. Second, access to binding sites depends on the state of chromatin. Nucleosomes forming higher-order structures create a "closed" domain where DNA binding sites are not accessible. In "open" domains the DNA is more accessible and transcription factors can bind. The transition between open and closed domains is an important additional layer of gene regulation in eukaryotes.
The limitations of genomics
By their very nature, genomics studies look at the big picture. Such studies can tell us a lot about how many transcription factors bind to DNA and how much of the genome is transcribed. They cannot tell us whether the data actually reflect function. For that, you have to take a more reductionist approach and dissect the roles of individual factors on individual genes. But working on single genes can be misleading ... you may miss the forest for the trees. Genomic studies have the opposite problem: they may see a forest where there are no trees.
Regulation and evolution
Much of what we see in evolution, especially when it comes to phenotypic differences between species, is due to differences in the regulation of shared genes. The idea dates back to the 1930s and the mechanisms were worked out mostly in the 1980s. It's the reason why all complex animals should have roughly the same number of genes—a prediction that was confirmed by sequencing the human genome. This is the field known as evo-devo or evolutionary developmental biology.
           Box 5-1: Can complex regulation evolve by accident?
Slightly harmful mutations can become fixed in a small population. This may cause a gene to be transcribed less frequently. Subsequent mutations that restore transcription may involve the binding of an additional factor to enhance transcription initiation. The result is more complex regulation that wasn't directly selected.
Open and closed chromatin domains
Gene expression in eukaryotes is regulated, in part, by changing the structure of chromatin. Genes in domains where nucleosomes are densely packed into compact structures are essentially invisible. Genes in more open domains are easily transcribed. In some species, the shift between open and closed domains is associated with methylation of DNA and modifications of histones but it's not clear whether these associations cause the shift or are merely a consequence of the shift.
           Box 5-2: X-chromosome inactivation
In females, one of the X-chromosomes is preferentially converted to a heterochromatic state where most of the genes are in closed domains. Consequently, many of the genes on the X chromosome are only expressed from one copy as is the case in males. The partial inactivation of an X-chromosome is mediated by a small regulatory RNA molecule and this inactivated state is passed on to all subsequent descendants of the original cell.
           Box 5-3: Regulating gene expression by rearranging the genome

In several cases, gene expression is controlled by rearranging the genome to bring a gene under the control of a new promoter region. Such rearrangements also explain some developmental anomalies, such as the growth of legs instead of antennae on the heads of fruit flies. They also account for many cancers.
ENCODE does it again
Genomic studies carried out by the ENCODE Consortium reported that a large percentage of the human genome is devoted to regulation. What the studies actually showed is that there are a large number of binding sites for transcription factors. ENCODE did not present good evidence that these sites were functional.
Does regulation explain junk?
The presence of huge numbers of spurious DNA binding sites is perfectly consistent with the view that 90% of our genome is junk. The idea that a large percentage of our genome is devoted to transcriptional regulation is inconsistent with everything we know from the studies of individual genes.
           Box 5-4: A thought experiment
Ford Doolittle asks us to imagine the following thought experiment. Take the fugu genome, which is very much smaller than the human genome, and the lungfish genome, which is very much larger, and subject them to the same ENCODE analysis that was performed on the human genome. All three genomes have approximately the same number of genes and most of those genes are homologous. Will the number of transcription factor binding sites be similar in all three species or will the number correlate with the size of the genomes and the amount of junk DNA?
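The thought experiment can be made concrete with a back-of-the-envelope calculation. The sketch below estimates how often a single short motif would be expected to turn up purely by chance in genomes of different sizes. It is a toy model, not an analysis of real data: it assumes random sequence with equal base frequencies and one exact motif, and the genome sizes (roughly 0.4 Gb for fugu, 3.2 Gb for human, 40 Gb for lungfish) are round numbers chosen for illustration. The function name is hypothetical.

```python
def expected_chance_sites(genome_bp: int, motif_bp: int) -> float:
    """Expected number of exact matches to one motif in random DNA.

    Assumes equal base frequencies and counts both strands
    (a palindromic motif would therefore be counted twice).
    """
    positions = genome_bp - motif_bp + 1  # possible start positions on one strand
    return 2 * positions / 4 ** motif_bp


# Approximate genome sizes in base pairs (illustrative assumptions only).
genomes = {
    "fugu": 400_000_000,
    "human": 3_200_000_000,
    "lungfish": 40_000_000_000,
}

for name, size in genomes.items():
    for k in (6, 8, 10):  # motif lengths in the 6-10 bp range
        sites = expected_chance_sites(size, k)
        print(f"{name:9s} {k:2d} bp motif: ~{sites:,.0f} chance sites")
```

In this toy model the number of chance sites tracks genome size, not gene number, which is the pattern you would expect to see if most detected binding sites are spurious.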
Small RNAs—a revolutionary discovery?
Does the human genome contain hundreds of thousands of genes for small non-coding RNAs that are required for the complex regulation of the protein-coding genes?
A “theory” that just won’t die
"... we have refuted the specific claims that most of the observed transcription across the human genome is random and put forward the case over many years that the appearance of a vast layer of RNA-based epigenetic regulation was a necessary prerequisite to the emergence of developmentally and cognitively advanced organisms." (Mattick and Dinger, 2013)
What the heck is epigenetics?
Epigenetics is a confusing term. It refers loosely to the regulation of gene expression by factors other than differences in the DNA. It's generally assumed to cover things like methylation of DNA and modification of histones. Both of these effects can be passed on from one cell to the next following mitosis. That fact has been known for decades. It is not controversial. The controversy is about whether the heritability of epigenetic features plays a significant role in evolution.
           Box 5-5: The Weismann barrier
The Weismann barrier refers to the separation between somatic cells and the germ line in complex multicellular organisms. The "barrier" is the idea that changes (e.g. methylation, histone modification) that occur in somatic cells can be passed on to other somatic cells but in order to affect evolution those changes have to be transferred to the germ line. That's unlikely. It means that Lamarckian evolution is highly improbable in such species.
How should science journalists cover this story?
The question is whether a large part of the human genome is devoted to regulation, thus accounting for an unexpectedly large genome. It's an explanation that attempts to refute the evidence for junk DNA. The issue is complex and very few science journalists are sufficiently informed to do it justice. They should, however, be making more of an effort to inform themselves about the controversial nature of the claims made by some scientists and they should be telling their readers that the issue has not yet been resolved.


Monday, September 11, 2017

What's in Your Genome?: Chapter 4: Pervasive Transcription (revised)

I'm working (slowly) on a book called What's in Your Genome?: 90% of your genome is junk! The first chapter is an introduction to genomes and DNA [What's in Your Genome? Chapter 1: Introducing Genomes ]. Chapter 2 is an overview of the human genome. It's a summary of known functional sequences and known junk DNA [What's in Your Genome? Chapter 2: The Big Picture]. Chapter 3 defines "genes" and describes protein-coding genes and alternative splicing [What's in Your Genome? Chapter 3: What Is a Gene?].

Chapter 4 is all about pervasive transcription and genes for functional noncoding RNAs. I've finally got a respectable draft of this chapter. This is an updated summary—the first version is at: What's in Your Genome? Chapter 4: Pervasive Transcription.

Friday, August 25, 2017

How much of the human genome is devoted to regulation?

All available evidence suggests that about 90% of our genome is junk DNA. Many scientists are reluctant to accept this evidence—some of them are even unaware of the evidence [Five Things You Should Know if You Want to Participate in the Junk DNA Debate]. Many opponents of junk DNA suffer from what I call The Deflated Ego Problem. They are reluctant to concede that humans have about the same number of genes as all other mammals and only a few more than insects.

One of the common rationalizations is to speculate that while humans may have "only" 25,000 genes they are regulated and controlled in a much more sophisticated manner than the genes in other species. It's this extra level of control that makes humans special. Such speculations have been around for almost fifty years but they have gained in popularity since publication of the human genome sequence.

In some cases, the extra level of regulation is thought to be due to abundant regulatory RNAs. This means there must be tens of thousands of extra genes expressing these regulatory RNAs. John Mattick is the most vocal proponent of this idea and he won an award from the Human Genome Organization for "proving" that his speculation is correct! [John Mattick Wins Chen Award for Distinguished Academic Achievement in Human Genetic and Genomic Research]. Knowledgeable scientists know that Mattick is probably wrong. They believe that most of those transcripts are junk RNAs produced by accidental transcription at very low levels from non-conserved sequences.

Wednesday, June 21, 2017

John Mattick still claims that most lncRNAs are functional

Most of the human genome is transcribed at some time or another in some tissue or another. The phenomenon is now known as pervasive transcription. Scientists have known about it for almost half a century.

At first the phenomenon seemed really puzzling since it was known that coding regions accounted for less than 1% of the genome and genetic load arguments suggested that only a small percentage of the genome could be functional. It was also known that more than half the genome consists of repetitive sequences that we now know are bits and pieces of defective transposons. It seemed unlikely back then that transcripts of defective transposons could be functional.

Part of the problem was solved with the discovery of RNA processing, especially splicing. It soon became apparent (by the early 1980s) that a typical protein coding gene was stretched out over 37,000 bp of which only 1300 bp were coding region. The rest was introns and intron sequences appeared to be mostly junk.
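As a quick check on those figures, here is a minimal sketch of the arithmetic. The two numbers come straight from the paragraph above; the variable names are just for illustration.

```python
GENE_LENGTH_BP = 37_000  # typical protein-coding gene, from the text
CODING_BP = 1_300        # coding region of that gene, from the text

coding_fraction = CODING_BP / GENE_LENGTH_BP
noncoding_fraction = 1 - coding_fraction  # introns plus other noncoding parts

print(f"coding:    {coding_fraction:.1%}")
print(f"noncoding: {noncoding_fraction:.1%}")
```

In other words, only about 3.5% of a typical gene is coding sequence, so the bulk of any primary transcript is made up of sequence that gets spliced out.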

Wednesday, March 08, 2017

What's in Your Genome? Chapter 4: Pervasive Transcription

I'm working (slowly) on a book called What's in Your Genome?: 90% of your genome is junk! The first chapter is an introduction to genomes and DNA [What's in Your Genome? Chapter 1: Introducing Genomes ]. Chapter 2 is an overview of the human genome. It's a summary of known functional sequences and known junk DNA [What's in Your Genome? Chapter 2: The Big Picture]. Chapter 3 defines "genes" and describes protein-coding genes and alternative splicing [What's in Your Genome? Chapter 3: What Is a Gene?].

Chapter 4 is all about pervasive transcription and genes for functional noncoding RNAs.
Chapter 4: Pervasive Transcription
  • How much of the genome is transcribed?
  • How do we know about pervasive transcription?
  • Different kinds of noncoding RNAs
  •         Box 4-1: Long noncoding RNAs (lncRNAs)
  • Understanding transcription
  •         Box 4-2: Revisiting the Central Dogma
  • What the scientific papers don’t tell you
  •         Box 4-3: John Mattick proves his hypothesis?
  • On the origin of new genes
  • The biggest blow to junk?
  •         Box 4-4: How do you tell if it’s functional?
  • Biochemistry is messy
  • Evolution as a tinkerer
  •         Box 4-5: Dealing with junk RNA
  • Change your worldview


Thursday, January 19, 2017

The pervasive transcription controversy: 2002

I'm working on a chapter about pervasive transcription and how it relates to the junk DNA debate. I found a short review in Nature from 2002 so I decided to see how much progress we've made in the past 15 years.

Most of our genome is transcribed at some time or another in some tissue. That's a fact we've known about since the late 1960s (King and Jukes, 1969). We didn't know it back then, but it turns out that a lot of that transcription is introns. In fact, the observation of abundant transcription led to the discovery of introns. We have about 20,000 protein-coding genes and the average gene is 37.2 kb in length. Thus, the total amount of the genome devoted to these genes is about 23%. That's the amount that's transcribed to produce primary transcripts and mRNA. There are about 5000 noncoding genes that contribute another 2% so genes occupy about 25% of our genome.
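The arithmetic behind those percentages can be sketched as follows. The gene count, average gene length, and the 2% figure for noncoding genes are taken from the paragraph above. The genome size of about 3.2 Gb is an assumption added here, since the text does not state which value was used, and the function name is hypothetical.

```python
GENOME_BP = 3_200_000_000  # assumed haploid human genome size (~3.2 Gb)


def genome_fraction(n_genes: int, mean_gene_bp: float,
                    genome_bp: int = GENOME_BP) -> float:
    """Fraction of the genome occupied by n_genes of a given mean length."""
    return n_genes * mean_gene_bp / genome_bp


# 20,000 protein-coding genes averaging 37.2 kb each (from the text).
protein_coding = genome_fraction(20_000, 37_200)

# The text gives the noncoding-gene contribution directly as about 2%.
noncoding = 0.02

print(f"protein-coding genes: {protein_coding:.1%}")
print(f"all genes:            {protein_coding + noncoding:.1%}")
```

With these inputs, protein-coding genes cover about 744 Mb, or roughly 23% of the genome, and adding the noncoding genes brings the total to about 25%.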

Thursday, July 28, 2016

You are junk

There's an article about junk DNA in the latest issue of New Scientist (July 27, 2016) [You are junk: Why it’s not your genes that make you human]. I've already discussed the false meme at the beginning of the article [False history and the number of genes: 2016]. Now it's time to look at the main argument.

The subtitle is ...
Genes make proteins make us – that was the received wisdom. But from big brains to opposable thumbs, some of our signature traits could come from elsewhere.
You can see where this is going. You start with a false paradigm, "Genes make proteins make us," then proceed to refute it. This is called "paradigm shafting."1

False history and the number of genes: 2016

There's an article about junk DNA in the latest issue of New Scientist. The title is: You are junk: Why it’s not your genes that make you human. The author is Colin Barras, a science writer from Michigan with a Ph.D. in paleontology.

He begins with .....
IT WAS a discovery that threatened to overturn everything we thought about what makes us human. At the dawn of the new millennium, two rival teams were vying to be the first to sequence the human genome. Their findings, published in February 2001, made headlines around the world. Back-of-the-envelope calculations had suggested that to account for the sheer complexity of human biology, our genome should contain roughly 100,000 genes. The estimate was wildly off. Both groups put the actual figure at around 30,000. We now think it is even fewer – just 20,000 or so.

"It was a massive shock," says geneticist John Mattick. "That number is tiny. It’s effectively the same as a microscopic worm that has just 1000 cells."

Wednesday, March 09, 2016

A 2004 kerfuffle over pervasive transcription in the mouse genome

The first drafts of the human genome sequence were published in 2001. There was still work to do on "finishing" the sequence but a lot of the International Human Genome Project (IHGP) team shifted to work on the mouse genome. The FANTOM Consortium and the RIKEN Genome Exploration Groups (I and II) published an analysis of mouse transcripts in December 2002.
Okazaki, Y., Furuno, M., Kasukawa, T., Adachi, J., Bono, H., Kondo, S., Nikaido, I., Osato, N., Saito, R., Suzuki, H. et al. (2002) Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature, 420:563-573. [doi: 10.1038/nature01266]

Only a small proportion of the mouse genome is transcribed into mature messenger RNA transcripts. There is an international collaborative effort to identify all full-length mRNA transcripts from the mouse, and to ensure that each is represented in a physical collection of clones. Here we report the manual annotation of 60,770 full-length mouse complementary DNA sequences. These are clustered into 33,409 ‘transcriptional units’, contributing 90.1% of a newly established mouse transcriptome database. Of these transcriptional units, 4,258 are new protein-coding and 11,665 are new non-coding messages, indicating that non-coding RNA is a major component of the transcriptome. 41% of all transcriptional units showed evidence of alternative splicing. In protein-coding transcripts, 79% of splice variations altered the protein product. Whole-transcriptome analyses resulted in the identification of 2,431 sense–antisense pairs. The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.

Wednesday, March 02, 2016

When philosophers talk about genomes

Postgenomics is a compendium of twelve scholarly articles by philosophers and sociologists who write about the implications of the human genome sequence and subsequent work on interpreting the results. The volume is edited by Sarah Richardson, a professor in Social Sciences (History of Science) at Harvard University (Cambridge, Massachusetts, USA), and by Hallam Stevens, a professor of History at Nanyang Technological University in Singapore.


The first essay is by Stevens and Richardson and it outlines the goal of the book.

Thursday, October 01, 2015

How many RNA molecules per cell are needed for function?

One of the issues in the junk DNA wars is the importance of all those RNAs that are detected in sensitive assays. About 90% of the human genome is complementary to RNAs that are made at some time in some tissue or other. Does this pervasive transcription mean that most of the genome is functional or are most of these transcripts just background noise due to accidental transcription?

Friday, August 28, 2015

Human Evolution: Genes, Genealogies and Phylogenies by Graeme Finlay

Human Evolution: Genes, Genealogies and Phylogenies was published in 2013 by Cambridge University Press. The author is Graeme Finlay, a cancer researcher at the University of Auckland, Auckland, New Zealand.

I first learned about this book from a book review published in the journal Evolution (Johnson, 2014). It sounded interesting so I bought a copy and read it.

There are four main chapters and each one covers a specific topic related to genomes and function. The topics are: Retroviruses, Transposons, Pseudogenes, and New Genes. There's lots and lots of interesting information in these chapters including an up-to-date summary of co-opted DNA that probably serves a biologically relevant function in our genome. This is the book to buy if you want a good review of the scientific literature on those topics.

Friday, July 24, 2015

John Parrington discusses genome sequence conservation

John Parrington has written a book called The Deeper Genome: Why there is more to the human genome than meets the eye. He claims that most of our genome is functional, not junk. I'm looking at how his arguments compare with Five Things You Should Know if You Want to Participate in the Junk DNA Debate.

There's one post for each of the five issues that informed scientists need to address if they are going to write about the amount of junk in your genome. This is the last one.

1. Genetic load
John Parrington and the genetic load argument
2. C-Value paradox
John Parrington and the c-value paradox
3. Modern evolutionary theory
John Parrington and modern evolutionary theory
4. Pseudogenes and broken genes are junk
John Parrington discusses pseudogenes and broken genes
5. Most of the genome is not conserved (this post)
John Parrington discusses genome sequence conservation

5. Most of the genome is not conserved

There are several places in the book where Parrington addresses the issue of sequence conservation. The most detailed discussion is on pages 92-95 where he discusses the criticisms leveled by Dan Graur against ENCODE workers. Parrington notes that 9% of the human genome is conserved and recognizes that this is a strong argument for function. It implies that >90% of our genome is junk.

Here's how Parrington dismisses this argument ...
John Mattick and Marcel Dinger ... wrote an article for the HUGO Journal, official journal of the Human Genome Organisation, entitled "The extent of functionality in the human genome." ... In response to the accusation that the apparent lack of sequence conservation of 90 per cent of the genome means that it has no function, Mattick and Dinger argued that regulatory elements and noncoding RNAs are much more relaxed in their link between structure and function, and therefore much harder to detect by standard measures of function. This could mean that 'conservation is relative', depending on the type of genomic structure being analyzed.
In other words, a large part of our genome (~70%?) could be producing functional regulatory RNAs whose sequence is irrelevant to their biological function. Parrington then writes a full page on Mattick's idea that the genome is full of genes for regulatory RNAs.

The idea that 90% of our genome is not conserved deserves far more serious treatment. In the next chapter (Chapter 7), Parrington discusses the role of RNA in forming a "scaffold" to organize DNA in three dimensions. He notes that ...
That such RNAs, by virtue of their sequence but also their 3D shape, can bind DNA, RNA, and proteins, makes them ideal candidates for such a role.
But if the genes for these RNAs make up a significant part of the genome then that means that some of their sequences are important for function. That has genetic load implications and also implications about conservation.

If it's not a "significant" fraction of the genome then Parrington should make that clear to his readers. He knows that 90% of our genome is not conserved, even between individuals (page 142), and he should know that this is consistent with genetic load arguments. However, almost all of his main arguments against junk DNA require that the extra DNA have a sequence-specific function. Those facts are not compatible. Here's how he justifies his position ...
Those proposing a higher figure [for functional DNA] believe that conservation is an imperfect measure of function for a number of reasons. One is that since many non-coding RNAs act as 3D structures, and because regulatory DNA elements are quite flexible in their sequence constraints, their easy detection by sequence conservation methods will be much more difficult than for protein-coding regions. Using such criteria, John Mattick and colleagues have come up with much higher figures for the amount of functionality in the genome. In addition, many epigenetic mechanisms that may be central for genome function will not be detectable through a DNA sequence comparison since they are mediated by chemical modifications of the DNA and its associated proteins that do not involve changes in DNA sequence. Finally, if genomes operate as 3D entities, then this may not be easily detectable in terms of sequence conservation.
This book would have been much better if Parrington had put some numbers behind his speculations. How much of the genome is responsible for making functional non-coding RNAs and how much of that should be conserved in one way or another? How much of the genome is devoted to regulatory sequences and what kind of sequence conservation is required for functionality? How much of the genome is required for "epigenetic mechanisms" and how do they work if the DNA sequence is irrelevant?

You can't argue this way. More than 90% of our genome is not conserved, not even between individuals. If a good bit of that DNA is, nevertheless, functional, then those functions must not have anything to do with the sequence of the genome at those specific sites. Thus, regions that specify non-coding RNAs, for example, must perform their function even though all the base pairs can be mutated. Same for regulatory sequences: the actual sequence of these regulatory sequences isn't conserved according to John Parrington. This requires a bit more explanation since it flies in the face of what we know about function and regulation.

Finally, if you are going to use bulk DNA arguments to get around the conflict then tell us how much of the genome you are attributing to formation of "3D entities." Is it 90%? 70%? 50%?
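The arithmetic behind this objection can be put in a few lines of Python. This is only a rough sketch: the 9% conserved figure is the one Parrington cites, while the genome size of about 3.2 billion base pairs, the function name `sequence_independent_fraction`, and the simplifying assumption that all conserved DNA is functional and is counted inside any larger claim are mine, not his.

```python
# Rough sketch of the "how much?" question, under stated assumptions.
GENOME_SIZE_BP = 3_200_000_000  # assumption: approximate haploid human genome size
CONSERVED_FRACTION = 0.09       # figure Parrington cites for conserved sequence


def sequence_independent_fraction(claimed_functional, conserved=CONSERVED_FRACTION):
    """Return the fraction of the genome that would have to be functional
    without being conserved, for a given claimed functional fraction.

    Assumes (hypothetically) that all conserved DNA is functional and is
    included in the claimed functional fraction.
    """
    if not 0.0 <= claimed_functional <= 1.0:
        raise ValueError("claimed_functional must be between 0 and 1")
    # A claim at or below the conserved fraction needs no unconserved function.
    return max(0.0, claimed_functional - conserved)


# The three figures asked about in the text: 50%, 70%, 90%.
for claim in (0.5, 0.7, 0.9):
    frac = sequence_independent_fraction(claim)
    megabases = frac * GENOME_SIZE_BP / 1e6
    print(f"claimed functional: {claim:.0%} -> "
          f"functional but not conserved: {frac:.0%} (~{megabases:,.0f} Mb)")
```

Whatever number a functionalist picks, the difference between it and the conserved fraction is the amount of DNA whose function would have to be independent of its sequence, and that is the quantity that needs explaining.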