
Friday, July 14, 2017

Revisiting the genetic load argument with Dan Graur

The genetic load argument is one of the oldest arguments for junk DNA and one of the most powerful reasons to conclude that most of our genome must be junk. The concept dates back to J.B.S. Haldane in the late 1930s, but the modern argument traditionally begins with Hermann Muller's classic paper from 1950. It has been extended and refined by him and many others since then (Muller, 1950; Muller, 1966).

Saturday, June 24, 2017

Debating alternative splicing (part II)

Mammalian genomes are very large. It looks like about 90% of a typical mammalian genome is junk DNA. These genomes are pervasively transcribed, meaning that almost 90% of the bases are complementary to a transcript produced at some time during development. I think most of those transcripts are due to inappropriate transcription initiation. They are mistakes in transcription. The genome is littered with transcription factor binding sites but only a small percentage are directly involved in regulating gene expression. The rest are due to spurious binding—a well-known property of DNA binding proteins. These conclusions are based, I believe, on a proper understanding of evolution and basic biochemistry.

If you add up all the known genes, they cover about 30% of the genome sequence. Most of this (>90%) is intron sequence and introns are mostly junk. The standard mammalian gene is transcribed to produce a precursor RNA that is subsequently processed by splicing out introns to produce a mature RNA. If it's a messenger RNA (mRNA) then it will be translated to produce a protein (technically, a polypeptide). So far, the vast majority of protein-coding genes produce a single protein but there are some classic cases of alternative splicing where a given gene produces several different protein isoforms, each of which has a specific function.

Thursday, June 22, 2017

Are most transcription factor binding sites functional?

The ongoing debate over junk DNA often revolves around data collected by ENCODE and others. The idea that most of our genome is transcribed (pervasive transcription) seems to indicate that genes occupy most of the genome. The opposing view is that most of these transcripts are accidental products of spurious transcription. We see the same opposing views when it comes to transcription factor binding sites. ENCODE and their supporters have mapped millions of binding sites throughout the genome and they believe this represents abundant and exquisite regulation. The opposing view is that most of these binding sites are spurious and non-functional.

The messy view is supported by many studies on the biophysical properties of transcription factor binding. These studies show that any DNA binding protein has a low affinity for random-sequence DNA. They will also bind with much higher affinity to sequences that resemble, but do not precisely match, the specific binding site [How RNA Polymerase Binds to DNA; DNA Binding Proteins]. If you take a species with a large genome, like us, then a typical protein-binding site of 6 bp will be present, by chance alone, at about 800,000 sites. Not all of those sites will be bound by the transcription factor in vivo because some of the DNA will be tightly wrapped up in dense chromatin domains. Nevertheless, an appreciable percentage of the genome will be available for binding, so typical ENCODE assays detect thousands of binding sites for each transcription factor.
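For anyone who wants to check the arithmetic, here's a minimal back-of-the-envelope sketch in Python. It assumes a random 3.2 Gb genome with equal base frequencies; real genomes are biased, so the exact number will differ, but the order of magnitude is the point.

```python
# Expected chance occurrences of a specific 6 bp sequence in a genome
# our size (assumes random sequence with equal base frequencies).
genome_size = 3.2e9            # human haploid genome, in base pairs
site_length = 6                # typical transcription factor binding site
p_match = 0.25 ** site_length  # chance that any given position matches

expected_sites = genome_size * p_match
print(f"{expected_sites:,.0f} sites expected by chance")  # ~781,000
```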

This information appears in all the best textbooks and it used to be a standard part of undergraduate courses in molecular biology and biochemistry. As far as I can tell, the current generation of new biochemistry researchers wasn't taught this information.

Wednesday, June 21, 2017

John Mattick still claims that most lncRNAs are functional

Most of the human genome is transcribed at some time or another in some tissue or another. The phenomenon is now known as pervasive transcription. Scientists have known about it for almost half a century.

At first the phenomenon seemed really puzzling since it was known that coding regions accounted for less than 1% of the genome and genetic load arguments suggested that only a small percentage of the genome could be functional. It was also known that more than half the genome consists of repetitive sequences that we now know are bits and pieces of defective transposons. It seemed unlikely back then that transcripts of defective transposons could be functional.

Part of the problem was solved with the discovery of RNA processing, especially splicing. It soon became apparent (by the early 1980s) that a typical protein-coding gene was stretched out over 37,000 bp, of which only 1,300 bp were coding region. The rest was introns, and intron sequences appeared to be mostly junk.

Thursday, May 18, 2017

Jonathan Wells illustrates zombie science by revisiting junk DNA

Jonathan Wells has written a new book (2017) called Zombie Science: More Icons of Evolution. He revisits his famous Icons of Evolution from 2000 and tries to show that nothing has changed in 17 years.

I wrote a book in 2000 about ten images, ten "icons of evolution," that did not fit the evidence and were empirically dead. They should have been buried, but they are still with us, haunting our science classrooms and stalking our children. They are part of what I call zombie science.
I won't bore you with the details. The icons fall into two categories: (1) those that were meaningless and/or trivial in 2000 and remain so today, and (2) those that Wells misunderstood in 2000 and are still misunderstood by creationists today.

Saturday, February 11, 2017

What did ENCODE researchers say on Reddit?

ENCODE researchers answered a bunch of questions on Reddit a few days ago. I asked them to give their opinion on how much junk DNA is in our genome but they declined to answer that question. However, I think we can get some idea about the current thinking in the leading labs by looking at the questions they did choose to answer. I don't think the picture is very encouraging. It's been almost five years since the ENCODE publicity disaster of September 2012. You'd think the researchers might have learned a thing or two about junk DNA since that fiasco.

The question and answer session on Reddit was prompted by the award of a new grant to ENCODE. They just received 31.5 million dollars to continue their search for functional regions in the human genome. You might have guessed that Dan Graur would have a few words to say about giving ENCODE even more money [Proof that 100% of the Human Genome is Functional & that It Was Created by a Very Intelligent Designer @ENCODE_NIH].

Thursday, February 09, 2017

NIH and UCSF ENCODE researchers are on Reddit right now!

Check out Science AMA Series: We’re Drs. Michael Keefer and James Kobie, infectious .... (Thanks to Paul Nelson for alerting me to the discussion.)

Here's part of the introduction ...
Yesterday NIH announced its latest round of ENCODE funding, which includes support for five new collaborative centers focused on using cutting edge techniques to characterize the candidate functional elements in healthy and diseased human cells. For example, when and where does an element function, and what exactly does it do.

UCSF is host to two of these five new centers, where researchers are using CRISPR gene editing, embryonic stem cells, and other new tools that let us rapidly screen hundreds of thousands of genome sequences in many different cell types at a time to learn which sequences are biologically relevant — and in what contexts they matter.

Today’s AMA brings together the leaders of NIH’s ENCODE project and the leaders of UCSF’s partner research centers.

Your hosts today are:

Nadav Ahituv, UCSF professor in the department of bioengineering and therapeutic sciences. Interested in gene regulation and how its alteration leads to morphological differences between organisms and human disease. Loves science and juggling.
Elise Feingold: Lead Program Director, Functional Genomics Program, NHGRI. I’ve been part of the ENCODE Project Management team since its start in 2003. I came up with the project’s name, ENCODE!
Dan Gilchrist, Program Director, Computational Genomics and Data Science, NHGRI. I joined the ENCODE Project Management team in 2014. Interests include mechanisms of gene regulation, using informatics to address biological questions, surf fishing.
Mike Pazin, Program Director, Functional Genomics Program, NHGRI. I’ve been part of the ENCODE Project Management team since 2011. My background is in chromatin structure and gene regulation. I love science, learning about how things work, and playing music.
Yin Shen: Assistant Professor in Neurology and Institute for Human Genetics, UCSF. I am interested in how genetics and epigenetics contribute to human health and diseases, especially for the human brain and complex neurological diseases. If I am not doing science, I like experimenting in the kitchen.

Thursday, January 19, 2017

The pervasive transcription controversy: 2002

I'm working on a chapter about pervasive transcription and how it relates to the junk DNA debate. I found a short review in Nature from 2002 so I decided to see how much progress we've made in the past 15 years.

Most of our genome is transcribed at some time or another in some tissue. That's a fact we've known about since the late 1960s (King and Jukes, 1969). We didn't know it back then, but it turns out that a lot of that transcription is intron sequence. In fact, the observation of abundant transcription led to the discovery of introns. We have about 20,000 protein-coding genes and the average gene is 37.2 kb in length. Thus, the total amount of the genome devoted to these genes is about 23%. That's the amount that's transcribed to produce primary transcripts and mRNA. There are about 5000 noncoding genes that contribute another 2%, so genes occupy about 25% of our genome.
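Here's the arithmetic behind those percentages, as a minimal sketch assuming a 3.2 Gb genome (the gene counts and average gene length are the ones quoted above).

```python
# Fraction of the genome occupied by genes, using the numbers in the post.
genome_size = 3.2e9                  # base pairs (standard estimate)
protein_coding_bp = 20_000 * 37_200  # 20,000 genes x 37.2 kb each

print(f"protein-coding genes: {protein_coding_bp / genome_size:.0%}")  # ~23%
print(f"plus ~2% noncoding genes: {protein_coding_bp / genome_size + 0.02:.0%}")  # ~25%
```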

Wednesday, December 14, 2016

The ENCODE publicity campaign of 2007

ENCODE1 published the results of a pilot project in 2007 (Birney et al., 2007). They looked at 1% (30 Mb) of the genome with a view to establishing their techniques and dealing with large amounts of data from many different groups. The goal was to "provide a more biologically informative representation of the human genome by using high-throughput methods to identify and catalogue the functional elements encoded."

The most striking result of this preliminary study was the confirmation of pervasive transcription. Here's what the ENCODE Consortium leaders said in the abstract,
Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap with one another.
ENCODE concluded that 93% of the genome is transcribed in one tissue or another. There are two possible explanations that account for pervasive transcription.

Friday, December 09, 2016

Using conservation to determine whether splice variants are functional

We've been having a discussion about function and how to recognize it. This is important when it comes to determining how much junk is in our genome [see Restarting the function wars (The Function Wars Part V)]. There doesn't seem to be any consensus on how to define "function" although there's general agreement on using sequence conservation as a first step. If some sequence under investigation is conserved in other species then that's a good sign that it's under negative selection and has a biological function. What if it's not conserved? Does that rule out function? The correct answer is "no" because one can always come up with explanations/excuses for such an observation. We discussed the example of de novo genes, which, by definition, are not conserved.

Let's look at another example: splice variants. Splice variants are different forms of RNA produced from the same gene. If they are biologically relevant then they will produce different forms of the protein (for protein-coding genes). This is an example of alternative splicing if, and only if, relevance has been proven.

Tuesday, December 06, 2016

Restarting the function wars (The Function Wars Part V)

The term "function wars" refers to debates over the meaning of the word "function" in biology. It refers specifically to the discussion about junk DNA because junk DNA is defined as DNA that does not have a biological function. The wars were (re-)started when the ENCODE Consortium decided to use a stupid definition of function in order to prove that most of our genome was functional. This prompted a number of papers attempting to create a more meaningful definition.

None of them succeeded, in my opinion, because biology is messy and doesn't lend itself to precise definitions. Look how difficult it is to define a "gene," for example. Or "evolution."

Nevertheless, some progress was made. Dan Graur has recently posted a summary of the two most important definitions of function [What does “function” mean in the context of evolution & what absurd situations may arise by using the wrong definition?]. The two definitions are "selected-effect" and "causal-role" (there are synonyms).

Wednesday, August 03, 2016

More junk science in Science

The latest issue of the journal Science (Aug. 1, 2016) has an article on a recent paper by Aires et al. (2016) published in Developmental Cell. Here's the abstract of the paper ...

Vertebrates exhibit a remarkably broad variation in trunk and tail lengths. However, the evolutionary and developmental origins of this diversity remain largely unknown. Posterior Hox genes were proposed to be major players in trunk length diversification in vertebrates, but functional studies have so far failed to support this view. Here we identify the pluripotency factor Oct4 as a key regulator of trunk length in vertebrate embryos. Maintaining high Oct4 levels in axial progenitors throughout development was sufficient to extend trunk length in mouse embryos. Oct4 also shifted posterior Hox gene-expression boundaries in the extended trunks, thus providing a link between activation of these genes and the transition to tail development. Furthermore, we show that the exceptionally long trunks of snakes are likely to result from heterochronic changes in Oct4 activity during body axis extension, which may have derived from differential genomic rearrangements at the Oct4 locus during vertebrate evolution.
... those ignorant of history are not condemned to repeat it; they are merely destined to be confused.

Stephen Jay Gould
Ontogeny and Phylogeny (1977)
The results were written up by a freelance journalist named Diana Crow [‘Junk DNA’ tells mice—and snakes—how to grow a backbone]. She writes ...
‘Junk DNA’ tells mice—and snakes—how to grow a backbone

Why does a snake have 25 or more rows of ribs, whereas a mouse has only 13? The answer, according to a new study, may lie in "junk DNA," large chunks of an animal’s genome that were once thought to be useless. The findings could help explain how dramatic changes in body shape have occurred over evolutionary history.

Scientists began discovering junk DNA sequences in the 1960s. These stretches of the genome—also known as noncoding DNA—contain the same genetic alphabet found in genes, but they don’t code for the proteins that make us who we are. As a result, many researchers long believed this mysterious genetic material was simply DNA debris accumulated over the course of evolution. But over the past couple decades, geneticists have discovered that this so-called junk is anything but. It has important functions, such as switching genes on and off and setting the timing for changes in gene activity.
Sandwalk readers will see all the mistakes and misconceptions in these paragraphs. She's talking about regulatory sequences that were never, ever, thought to be junk. The paper being discussed has nothing to do with junk DNA and the results do not in any way alter our understanding of developmental gene regulation.

If you look carefully at the abstract, you'll see the word "heterochronic." This is one of Stephen Jay Gould's favorite words. He wrote about it in Ontogeny and Phylogeny.
I wish to emphasize one other distinction. Evolution occurs when ontogeny is altered in one of two ways: when new characters are introduced at any stage of development with varying effects upon subsequent stages, or when characters already present undergo changes in developmental timing. Together, these two processes exhaust the formal concept of phyletic change; the second process is heterochrony. [my emphasis ... LAM] If change in developmental timing is important in evolution, then this second process must be very common.
This was written in 1977—that's almost 40 years ago! These ideas were around for decades before Gould wrote his book1 and they have been shown to be correct by numerous studies in the 1980s.

What's going on here? Science is supposed to be one of the leading science journals. How could it publish an article that misrepresents the field so badly? Do the editors send these "Latest News" articles out for review?


1. Ed Lewis shared the Nobel Prize in 1995 for his contribution to "the genetic control of early embryonic development" [The Nobel Prize in Physiology or Medicine 1995].

Monday, July 11, 2016

A genetics professor who rejects junk DNA

Praveen Sethupathy is a genetics professor at the University of North Carolina in Chapel Hill, North Carolina, USA.

He explains why he is a Christian and why he is "more than his genes" in Am I more than my genes? Faith, identity, and DNA.

Here's the opening paragraph ...
The word “genome” suggests to many that our DNA is simply a collection of genes from end-to-end, like books on a bookshelf. But it turns out that large regions of our DNA do not encode genes. Some once called these regions “junk DNA.” But this was a mistake. More recently, they have been referred to as the “dark matter” of our genome. But what was once dark is slowly coming to light, and what was once junk is being revealed as treasure. The genome is filled with what we call “control elements” that act like switches or rheostats, dialing the activation of nearby genes up and down based on whatever is needed in a particular cell. An increasing number of devastating complex diseases, such as cancer, diabetes, and heart disease, can often be traced back, in part, to these rheostats not working properly.

Thursday, June 30, 2016

Do Intelligent Design Creationists still think junk DNA refutes ID?

I'm curious about whether Intelligent Design Creationists still think their prediction about junk DNA has been confirmed.


Here's what Stephen Meyer wrote in Darwin's Doubt (p. 400).
The noncoding regions of the genome were assumed to be nonfunctional detritus of the trial-and-error mutational process—the same process that produced the functional code in the genome. As a result, these noncoding regions were deemed "junk DNA," including by no less a scientific luminary than Francis Crick.

Because intelligent design asserts that an intelligent cause produced the genome, design advocates have long predicted that most of the nonprotein-coding sequences in the genome should perform some biological function, even if they do not direct protein synthesis. Design theorists do not deny that mutational processes might have degraded some previously functional DNA, but we have predicted that the functional DNA (the signal) should dwarf the nonfunctional DNA (the noise), and not the reverse. As William Dembski, a leading design proponent, predicted in 1998, "On an evolutionary view we expect a lot of useless DNA. If, on the other hand, organisms are designed, we expect DNA, as much as possible, to exhibit function."
I'm trying to write about this in my book and I want to be as fair as possible.

Do most ID proponents still believe this is an important prediction from ID theory?

Do most ID proponents still think that most of the human genome is functional?


Monday, May 16, 2016

Tim Minchin's "Storm," the animated movie, and another not-so-good Minchin cartoon

I've mentioned this before but it bears repeating. If you haven't listened to "Storm" then you are in for a treat because now you can listen AND watch. If you've heard it before, then hear it again. The message never gets old.


A word of caution. Minchin may be very good at recognizing pseudoscience and quacks but he can be a bit gullible when listening to scientists. He was completely taken in by the ENCODE hype back in 2012. This cartoon is also narrated by Tim Minchin but it's not so good.



Monday, May 02, 2016

The Encyclopedia of Evolutionary Biology revisits junk DNA

The Encyclopedia of Evolutionary Biology is a four-volume set of articles by leading evolutionary biologists. An online version is available at ScienceDirect. Many universities will have free access.

I was interested in what they had to say about junk DNA and the evolution of large complex genomes. The only article that directly addressed the topic was "Noncoding DNA Evolution: Junk DNA Revisited" by Michael Z. Ludwig of the Department of Ecology and Evolution at the University of Chicago. Ludwig is a Research Associate (Assistant Professor) who works with Martin Kreitman on "Developmental regulation of gene expression and the genetic basis for evolution of regulatory DNA."

As you could guess from the title of the article, Michael Ludwig divides the genome into two fractions: protein-coding genes and noncoding DNA. The fact that organismal complexity doesn't correlate with the number of genes (protein-coding) is a problem that requires an explanation, according to Ludwig. He assumes that the term "junk DNA" was used in the past to account for our lack of knowledge about noncoding DNA.
Eukaryotic genomes mostly consist of DNA that is not translated into protein sequence. However, noncoding DNA (ncDNA) has been little studied relative to proteins. The lack of knowledge about its functional significance has led to hypotheses that much nongenic DNA is useless "junk" (Ohno, 1972) or that it exists only to replicate itself (Doolittle and Sapienza, 1980; Orgel and Crick, 1980).
Ludwig says that we now know some of the functions of non-coding DNA and one of them is regulation of gene expression.
These regulatory sequences are distributed among selfish transposons and middle or short repetitive DNAs. The genome is an extremely complex machine; functionally as well as structurally it is generally not possible to disentangle the regulatory function from the junk selfish activity. The idea of junk DNA needs to be revisited.
Of course we all know about regulatory sequences. We've known about this function of non-coding DNA for half a century. The question that interests us is not whether non-coding DNA has a function but whether a large proportion of noncoding DNA is junk.

Ludwig seems to be arguing that a significant fraction of the mammalian genome is devoted to regulation. He doesn't ever specify what this fraction is but apparently it's large enough to "revisit" junk DNA.

The biggest obstacle to his thesis is the fact that only 8% of the human genome is conserved (Rands et al., 2014). Ludwig says that 1% of the genome is coding DNA and 7% "has a functional regulatory gene expression role" according to the Rands et al. study. This is somewhat misleading since Rands et al. specifically mention that not all of this conserved DNA will be regulatory.

All of this is consistent with a definition of function specifying that it must be under negative selection (i.e. conserved). It leads to the conclusion that about 90% of the human genome is junk. That doesn't require a re-evaluation of junk.

In order to "revisit" junk DNA, the proponents of the "complex machine" view of evolution must come up with plausible reasons why lack of sequence conservation does not rule out function. Ludwig offers up the standard rationales ...
  1. Some ultra-conserved sequences don't seem to have a function and this "shows that the extent of sequence conservation is not a good predictor of the functional importance of a sequence."
  2. The amount of conserved sequence depends on the alignment and alignment is difficult.
  3. About 40%-70% of the noncoding DNA in Drosophila melanogaster is under functional constraint within the species but not between D. melanogaster and D. simulans. Therefore, some large fraction of functional regulatory sequences might only be conserved in the human lineage and it won't show up in comparisons between species. (Does this explain onions?)
The idea here is that there is rapid turnover of functional DNA binding sites required for regulation but the overall fraction of DNA devoted to regulation remains large. This explains why there doesn't seem to be a correlation between the amount of conserved DNA and the amount that can possibly be devoted to regulating gene expression. The argument implies that much more than 7% of the genome is required for regulation. The amount has to be >50% or so in order to justify overthrowing the concept of junk DNA.

That's a ridiculous number, but so is 7%. Imagine that "only" 7% of the genome is functionally involved in regulating expression of the protein-coding genes. That's 224 million base pairs of DNA, or approximately 10,000 base pairs of cis-regulatory elements (CREs) for every protein-coding gene.
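The arithmetic is easy to check. A minimal sketch, assuming a 3.2 Gb genome and 20,000 protein-coding genes:

```python
# Regulatory DNA implied by the 7% figure, per protein-coding gene.
genome_size = 3.2e9          # base pairs (standard estimate)
regulatory_fraction = 0.07   # the 7% figure under discussion
n_genes = 20_000             # protein-coding genes

regulatory_bp = genome_size * regulatory_fraction
print(f"total regulatory DNA: {regulatory_bp / 1e6:.0f} Mb")          # 224 Mb
print(f"per protein-coding gene: {regulatory_bp / n_genes:,.0f} bp")  # ~11,200 bp
```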

There is no evidence whatsoever that even this amount (7%) of DNA is required for regulation but Ludwig would like to think that the actual amount is much greater. The lack of conservation is dismissed by assuming rapid turnover while conserving function and/or stabilizing selection on polymorphic sequences.

The problem here is that Ludwig is constructing a just-so evolutionary story to explain something that doesn't require an explanation. If there's no evidence that a large fraction of the genome is required for regulation then there's no problem that needs explaining. Ludwig does not tell us why he believes that most of our genome is required for regulation. Maybe it's because of ENCODE?

Since this is published in the Encyclopedia of Evolutionary Biology, I assume that this sort of evolutionary argument resonates with many evolutionary biologists. That's sad.


Rands, C. M., Meader, S., Ponting, C. P., and Lunter, G. (2014) 8.2% of the Human Genome Is Constrained: Variation in Rates of Turnover across Functional Element Classes in the Human Lineage. PLoS Genetics, 10(7), e1004525. [doi: 10.1371/journal.pgen.1004525]

Sunday, March 27, 2016

Georgi Marinov reviews two books on junk DNA

The December issue of Evolution: Education and Outreach has a review of two books on junk DNA. The reviewer is Georgi Marinov, a name that's familiar to Sandwalk readers. He is currently working with Michael Lynch at Indiana University in Bloomington, Indiana, USA. You can read the review at: A deeper confusion.

The books are ...
The Deeper Genome: Why there is more to the human genome than meets the eye, by John Parrington, (Oxford, United Kingdom: Oxford University Press), 2015. ISBN:978-0-19-968873-9.

Junk DNA: A Journey Through the Dark Matter of the Genome, by Nessa Carey, (New York, United States: Columbia University Press), 2015. ISBN:978-0-23-117084-0.
You really need to read the review for yourselves but here's a few teasers.
If taken uncritically, these texts can be expected to generate even more confusion in a field that already has a serious problem when it comes to communicating the best understanding of the science to the public.
Parrington claims that noncoding DNA was thought to be junk and Georgi replies,
However, no knowledgeable person has ever defended the position that 98 % of the human genome is useless. The 98 % figure corresponds to the fraction of it that lies outside of protein coding genes, but the existence of distal regulatory elements, as nicely narrated by the author himself, has been at this point in time known for four decades, and there have been numerous comparative genomics studies pointing to a several-fold larger than 2% fraction of the genome that is under selective constraint.
I agree. That's a position that I've been trying to advertise for several decades and it needs to be constantly reiterated since there are so many people who have fallen for the myth.

Georgi goes on to explain where Parrington goes wrong about the ENCODE results. This critique is devastating, coming, as it does, from an author of the most relevant papers.1 My only complaint about the review is that Georgi doesn't reveal his credentials. When he quotes from those papers—as he does many times—he should probably have mentioned that he is an author of those quotes.

Georgi goes on to explain four main arguments for junk DNA: genetic load, the C-value Paradox, transposons (selfish DNA), and modern evolutionary theory. I like this part since it's similar to the Five Things You Should Know if You Want to Participate in the Junk DNA Debate. The audience of this journal is teachers and this is important information that they need to know, and probably don't.

His critique of Nessa Carey's book is even more devastating. It begins with,
Still, despite a few unfortunate mistakes, The Deeper Genome is well written and gets many of its facts right, even if they are not interpreted properly. This is in stark contrast with Nessa Carey’s Junk DNA: A Journey Through the Dark Matter of the Genome. Nessa Carey has a PhD in virology and has in the past been a Senior Lecturer in Molecular Biology at Imperial College, London. However, Junk DNA is a book not written at an academic level but instead intended for very broad audience, with all the consequences that the danger of dumbing it down for such a purpose entails.
It gets worse. Nessa Carey claims that scientists used to think that all noncoding DNA was junk but recent discoveries have discredited that view. Georgi sets her straight with,
Of course, scientists have had a very good idea why so much of our DNA does not code for proteins, and they have had that understanding for decades, as outlined above. Only by completely ignoring all that knowledge could it have been possible to produce many of the chapters in the book. The following are referred to as junk DNA by Carey, with whole chapters dedicated to each of them (Table 3).


The inclusion of tRNAs and rRNAs in the list of “previously thought to be junk” DNA is particularly baffling given that they have featured prominently as critical components of the protein synthesis machinery in all sorts of basic high school biology textbooks for decades, not to mention the role that rRNAs and some of the other noncoding RNAs on that list play in many “RNA world” scenarios for the origin of life. How could something that has so often been postulated to predate the origin of DNA as the carrier of genetic information (Jeffares et al. 1998; Fox 2010) and that must have been of critical importance both before and after that be referred to as “junk”?
You would think that this is something that doesn't have to be explained to biology teachers but the evidence suggests otherwise. One of those teachers recently reviewed Nessa Carey's book very favorably in the journal The American Biology Teacher and another high school teacher reveals his confusion about the subject in the comments to my post [see Teaching about genomes using Nessa Carey's book: Junk DNA].

It's good that Georgi Marinov makes this point forcibly.

Now I'm going to leave you with an extended quote from Georgi Marinov's review. Coming from a young scientist, this is very potent and it needs to be widely disseminated. I agree 100%.
The reason why scientific results become so distorted on their way from scientists to the public can only be understood in the socioeconomic context in which science is done today. As almost everyone knows at this point, science has existed in a state of insufficient funding and ever increasing competition for limited resources (positions, funding, and the small number of publishing slots in top scientific journals) for a long time now. The best way to win that Darwinian race is to make a big, paradigm shifting finding. But such discoveries are hard to come by, and in many areas might actually never happen again—nothing guarantees that the fundamental discoveries in a given area have not already been made. ... This naturally leads to a publishing environment that pretty much mandates that findings are framed in the most favorable and exciting way, with important caveats and limitations hidden between the lines or missing completely. The author is too young to have directly experienced those times, but has read quite a few papers in top journals from the 1970s and earlier, and has been repeatedly struck by the difference between the open discussion one can find in many of those old articles and the currently dominant practices.

But that same problem is not limited to science itself, it seems to be now prevalent at all steps in the chain of transmission of findings, from the primary literature, through PR departments and press releases, and finally, in the hands of the science journalists and writers who report directly to the lay audience, and who operate under similar pressures to produce eye-catching headlines that can grab the fleeting attention of readers with ever decreasing ability to concentrate on complex and subtle issues. This leads to compound overhyping of results, of which The Deeper Genome is representative, and to truly surreal distortion of the science, such as what one finds in Nessa Carey’s Junk DNA.

The field of functional genomics is especially vulnerable to these trends, as it exists in the hard-to-navigate context of very rapid technological changes, a potential for the generation of truly revolutionary medical technologies, and an often difficult interaction with evolutionary biology, a controversial for a significant portion of society topic. It is not a simple subject to understand and communicate given all these complexities while in the same time the potential and incentives to mislead and misinterpret are great, and the consequences of doing so dire. Failure to properly communicate genomic science can lead to a failure to support and develop the medical breakthroughs it promises to deliver, or what might be even worse, to implement them in such a way that some of the dystopian futures imagined by sci-fi authors become reality. In addition, lending support to anti-evolutionary forces in society by distorting the science in a way that makes it appear to undermine evolutionary theory has profound consequences that given the fundamental importance of evolution for the proper understanding of humanity's place in nature go far beyond making life even more difficult for teachers and educators or even the general destruction of science education. Writing on these issues should exercise the needed care and make sure that facts and their best interpretations are accurately reported. Instead, books such as The Deeper Genome and Junk DNA are prime examples of the negative trends outlined above, and are guaranteed to only generate even deeper confusion.
It's not easy to explain these things to a general audience, especially an audience that has been inundated with false information and false ideas. I'm going to give it a try but it's taking a lot more effort than I imagined.


1. Georgi Marinov is an author on the original ENCODE paper that claimed 80% of our genome is functional (ENCODE Project Consortium, 2012) and the paper where the ENCODE leaders retreated from that claim (Kellis et al., 2014).

ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature, 489:57-74. [doi: 10.1038/nature11247]

Kellis, M., Wold, B., Snyder, M.P., Bernstein, B.E., Kundaje, A., Marinov, G.K., Ward, L.D., Birney, E., Crawford, G.E., and Dekker, J. (2014) Defining functional DNA elements in the human genome. Proc. Natl. Acad. Sci. (USA) 111:6131-6138. [doi: 10.1073/pnas.1318948111]

Tuesday, March 22, 2016

How do you characterize these scientists?

We've been having a discussion on another thread about ID proponents. Are some of them acting in good faith or are they all lying and deceiving their followers?

I have similar problems about many scientists. I've been reading up on pervasive transcription and the potential number of genes for noncoding, functional, RNAs in the human genome. As far as I can tell, there are only a few hundred examples that have any supporting evidence. There are good scientific reasons to believe that most of the detected transcripts are junk RNA produced as the result of accidental, spurious, transcription.

There are about 20,000 protein-coding genes in the human genome. I think it's unlikely that there are more than a few thousand genes for functional RNAs for a total of less than 25,000 genes.

Here's one of the papers I found.
Guil, S. and Esteller, M. (2015) RNA–RNA interactions in gene regulation: the coding and noncoding players. Trends in Biochemical Sciences 40:248-256. [doi: 10.1016/j.tibs.2015.03.001]
Trends in Biochemical Sciences is a good journal and this is a review of the field by supposed experts. The authors are from the Department of Physiological Sciences II at the University of Barcelona School of Medicine in Barcelona, Catalonia, Spain. The senior author, Manel Esteller, has a Wikipedia entry [Manel Esteller].

Here's the first paragraph of the introduction.
There are more genes encoding regulatory RNAs than encoding proteins. This evidence, obtained in recent years from the sum of numerous post-genomic deep-sequencing studies, give a good clue of the gigantic step we have taken from the years of the central dogma: one gene gives rise to one RNA to produce one protein.
The first sentence is not true by any stretch of the imagination. The best that could be said is that there "may" be more genes for regulatory RNAs (> 20,000) but there's no strong consensus yet. Since the first sentence is an untruth, it follows that it is incorrect to say that the evidence supports such a claim.

It's also untrue to distort the real meaning of the Central Dogma of Molecular Biology, which never said that all genes have to encode proteins. The authors don't understand the history of their field in spite of the fact they are writing a review of that field.

Here's the problem. Are these scientists acting in good faith when they say such nonsense? Does acting in "good faith" require healthy criticism and critical thinking or is "honesty" the only criterion? The authors are clearly deluded about the controversy since they assume that it has been resolved in favor of their personal biases but they aren't lying. Can we distinguish between competent science and bad science based on such statements? Can we say that these scientists are incompetent or is that too harsh?

Furthermore, what ever happened to peer review? Isn't the system supposed to prevent such mistakes?


Wednesday, March 09, 2016

A 2004 kerfuffle over pervasive transcription in the mouse genome

The first drafts of the human genome sequence were published in 2001. There was still work to do on "finishing" the sequence but a lot of the International Human Genome Project (IHGP) team shifted to work on the mouse genome. The FANTOM Consortium and the RIKEN Genome Exploration Groups (I and II) published an analysis of mouse transcripts in December 2002.
Okazaki, Y., Furuno, M., Kasukawa, T., Adachi, J., Bono, H., Kondo, S., Nikaido, I., Osato, N., Saito, R., Suzuki, H. et al. (2002) Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature, 420:563-573. [doi: 10.1038/nature01266]

Only a small proportion of the mouse genome is transcribed into mature messenger RNA transcripts. There is an international collaborative effort to identify all full-length mRNA transcripts from the mouse, and to ensure that each is represented in a physical collection of clones. Here we report the manual annotation of 60,770 full-length mouse complementary DNA sequences. These are clustered into 33,409 ‘transcriptional units’, contributing 90.1% of a newly established mouse transcriptome database. Of these transcriptional units, 4,258 are new protein-coding and 11,665 are new non-coding messages, indicating that non-coding RNA is a major component of the transcriptome. 41% of all transcriptional units showed evidence of alternative splicing. In protein-coding transcripts, 79% of splice variations altered the protein product. Whole-transcriptome analyses resulted in the identification of 2,431 sense–antisense pairs. The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.

Wednesday, November 25, 2015

Selfish genes and transposons

Back in 1980, the idea that large fractions of animal and plant genomes could be junk was quite controversial. Although the idea was consistent with the latest developments in population genetics, most scientists were unaware of these developments. They were looking for adaptive ways of explaining all the excess DNA in these genomes.

Some scientists were experts in modern evolutionary theory but still wanted to explain "junk DNA." Doolittle & Sapienza, and Orgel & Crick, published back-to-back papers in the April 17, 1980 issue of Nature. They explained junk DNA by claiming that most of it was due to the presence of "selfish" transposons that were being selected and preserved because they promoted their own replication and transmission to future generations, while having little or no effect on the fitness of the organisms they inhabit. This is natural selection acting at a different level.

This prompted some responses in later editions of the journal and then responses to the responses.

Here's the complete series ...