
Showing posts sorted by relevance for query encode project. Sort by date Show all posts

Thursday, December 15, 2016

Nature opposes misinformation (pot, kettle, black)

The lead editorial in last week's issue of Nature (Dec. 8, 2016) urges us to Take the time and effort to correct misinformation. The author (Phil Williamson) is a scientist whose major research interest is climate change and the issue he's addressing is climate change denial. That's a clear example of misinformation but there are other, more subtle, examples that also need attention. I like what he says in the opening paragraphs,

Most researchers who have tried to engage online with ill-informed journalists or pseudoscientists will be familiar with Brandolini’s law (also known as the Bullshit Asymmetry Principle): the amount of energy needed to refute bullshit is an order of magnitude bigger than that needed to produce it. Is it really worth taking the time and effort to challenge, correct and clarify articles that claim to be about science but in most cases seem to represent a political ideology?

I think it is. Challenging falsehoods and misrepresentation may not seem to have any immediate effect, but someone, somewhere, will hear or read our response. The target is not the peddler of nonsense, but those readers who have an open mind on scientific problems. A lie may be able to travel around the world before the truth has its shoes on, but an unchallenged untruth will never stop.
I've had a bit of experience trying to engage journalists who appear to be ill-informed. I've had little success in convincing them that their reporting leaves a lot to be desired.

I agree with Phil Williamson that challenging falsehoods and misrepresentation is absolutely necessary even if it has no immediate effect. Recently I posted a piece on the misrepresentations of the ENCODE results in 2007 and pointed a finger at Nature and their editors [The ENCODE publicity campaign of 2007]. They are responsible because they did not ensure that the main paper (Birney et al., 2007) was subjected to appropriate peer review. They are responsible because they promoted misrepresentations in their News article and they are responsible because they published a rather silly News & Views article that did little to correct the misrepresentations.

That was nine years ago. Nature never admitted they were partly to blame for misrepresenting the function of the human genome.

Sunday, September 09, 2012

Brendan Maher Writes About the ENCODE/Junk DNA Publicity Fiasco

Brendan Maher is a Feature Editor for Nature. He wrote a lengthy article for Nature when the ENCODE data was published on Sept. 5, 2012 [ENCODE: The human encyclopaedia]. Here's part of what he said,
After an initial pilot phase, ENCODE scientists started applying their methods to the entire genome in 2007. Now that phase has come to a close, signalled by the publication of 30 papers, in Nature, Genome Research and Genome Biology. The consortium has assigned some sort of function to roughly 80% of the genome, including more than 70,000 ‘promoter’ regions — the sites, just upstream of genes, where proteins bind to control gene expression — and nearly 400,000 ‘enhancer’ regions that regulate expression of distant genes.
I expect encyclopedias to be much more accurate than this.

As most people know by now, there are many of us who challenge the implication that 80% of the genome has a function (i.e., it's not junk).1 We think the Consortium was not being very scientific by publicizing such a ridiculous claim.

The main point of Maher's article was that the ENCODE results reveal a huge network of regulatory elements controlling expression of the known genes. This is the same point made by the ENCODE researchers themselves. Here's how Brendan Maher expressed it.

The real fun starts when the various data sets are layered together. Experiments looking at histone modifications, for example, reveal patterns that correspond with the borders of the DNaseI-sensitive sites. Then researchers can add data showing exactly which transcription factors bind where, and when. The vast desert regions have now been populated with hundreds of thousands of features that contribute to gene regulation. And every cell type uses different combinations and permutations of these features to generate its unique biology. This richness helps to explain how relatively few protein-coding genes can provide the biological complexity necessary to grow and run a human being.
I think that much of this hype comes from a problem I've called The Deflated Ego Problem. It arises because many scientists were disappointed to discover that humans have about the same number of genes as many other species yet we are "obviously" much more complex than a mouse or a pine tree. There are many ways of solving this "problem." One of them is to postulate that humans have a much more sophisticated network of control elements in our genome. Of course, this ignores the fact that the genomes of mice and trees are not smaller than ours.

Sunday, July 19, 2015

The fuzzy thinking of John Parrington: pervasive transcription

Opponents of junk DNA usually emphasize the point that they were surprised when the draft human genome sequence was published in 2001. They expected about 100,000 genes but the initial results suggested fewer than 30,000 (the final number is about 25,000).1 The reason they were surprised was that they had not kept up with the literature on the subject and they had not been paying attention when the sequence of chromosome 22 was published in 1999 [see Facts and Myths Concerning the Historical Estimates of the Number of Genes in the Human Genome].

The experts were expecting about 30,000 genes and that's what the genome sequence showed. Normally this wouldn't be such a big deal. Those who were expecting a large number of genes would just admit that they were wrong and they hadn't kept up with the literature over the past 30 years. They should have realized that discoveries in other species and advances in developmental biology had reinforced the idea that mammals only needed about the same number of genes as other multicellular organisms. Most of the differences are due to regulation. There was no good reason to expect that humans would need a huge number of extra genes.

That's not what happened. Instead, opponents of junk DNA insist that the complexity of the human genome cannot be explained by such a low number of genes. There must be some other explanation to account for the missing genes. This sets the stage for at least seven different hypotheses that might resolve The Deflated Ego Problem. One of them is the idea that the human genome contains thousands and thousands of nonconserved genes for various regulatory RNAs. These are the missing genes and they account for a lot of the "dark matter" of the genome—sequences that were thought to be junk.

Here's how John Parrington describes it on page 91 of his book.
The study [ENCODE] also found that 80 per cent of the genome was generating RNA transcripts and, as evidence of these having importance, many were found only in specific cellular compartments, indicating that they have fixed addresses where they operate. Surely there could hardly be a greater divergence from Crick's central dogma than this demonstration that RNAs were produced in far greater numbers across the genome than could be expected if they were simply intermediates between DNA and protein. Indeed, some ENCODE researchers argued that the basic unit of transcription should now be considered as the transcript. So Stamatoyannopoulos claimed that 'the project has played an important role in changing our concept of the gene.'
This passage illustrates my difficulty in coming to grips with Parrington's logic in The Deeper Genome. Just about every page contains statements that are either wrong or misleading and when he strings them together they lead to a fundamentally flawed conclusion. In order to critique the main point, you have to correct each of the so-called "facts" that he gets wrong. This is very tedious.

I've already explained why Parrington is wrong about the Central Dogma of Molecular Biology [John Avise doesn't understand the Central Dogma of Molecular Biology]. His readers don't know that he's wrong so they think that the discovery of noncoding RNAs is a revolution in our understanding of biochemistry—a revolution led by the likes of John A. Stamatoyannopoulos in 2012.

The reference in the book to the statement by Stamatoyannopoulos is from the infamous Elizabeth Pennisi article on ENCODE Project Writes Eulogy for Junk DNA (Pennisi, 2012). Here's what she said in that article ...
As a result of ENCODE, Gingeras and others argue that the fundamental unit of the genome and the basic unit of heredity should be the transcript—the piece of RNA decoded from DNA—and not the gene. “The project has played an important role in changing our concept of the gene,” Stamatoyannopoulos says.
I'm not sure what concept of a gene these people had before 2012. It appears that John Parrington is under the impression that genes are units that encode proteins and maybe that's what Pennisi and Stamatoyannopoulos thought as well.

If so, then perhaps the publicity surrounding ENCODE really did change their concept of a gene but all that proves is that they were remarkably uninformed before 2012. Intelligent biochemists have known for decades that the best definition of a gene is "a DNA sequence that is transcribed to produce a functional product."2 In other words, we have been defining a gene in terms of transcripts for 45 years [What Is a Gene?].

This is just another example of wrong and misleading statements that will confuse readers. If I were writing a book I would say, "The human genome sequence confirmed the predictions of the experts that there would be no more than 30,000 genes. There's nothing in the genome sequence or the ENCODE results that has any bearing on the correct understanding of the Central Dogma and there's nothing that changes the correct definition of a gene."

You can see where John Parrington's thinking is headed. Apparently, Parrington is one of those scientists who were completely unaware of the fact that genes could specify functional RNAs and completely unaware of the fact that Crick knew this back in 1970 when he tried to correct people like Parrington. Thus, Parrington and his colleagues were shocked to learn that the human genome had only 25,000 genes and many of them didn't encode proteins. Instead of realizing that his view was wrong, he thinks that the ENCODE results overthrew those old definitions and changed the way we think about genes. He tries to convince his readers that there was a revolution in 2012.

Parrington seems to be vaguely aware of the idea that most pervasive transcription is due to noise or junk RNA. However, he gives his readers no explanation of the reasoning behind such a claim. Spurious transcription is predicted because we understand the basic concept of transcription initiation. We know that promoter sequences and transcription factor binding sites are short sequences and we know that they HAVE to occur at high frequency in large genomes just by chance. This is not just speculation. [see The "duon" delusion and why transcription factors MUST bind non-functionally to exon sequences and How RNA Polymerase Binds to DNA]

If our understanding of transcription initiation is correct then all you need is an activator transcription factor binding site near something that's compatible with a promoter sequence. Any given cell type will contain a number of such factors and they must bind to a large number of nonfunctional sites in a large genome. Many of these will cause occasional transcription giving rise to low abundance junk RNA. (Most of the ENCODE transcripts are present at less than one copy per cell.)
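The chance-occurrence argument is easy to check with a back-of-the-envelope calculation. The sketch below is mine, not from any ENCODE paper; it assumes the simplest possible model (random sequence, equal base frequencies, exact matches only—real binding sites are degenerate, so this undercounts).

```python
# Expected number of exact occurrences, by chance alone, of one specific
# short binding-site motif in a genome of random sequence.
# Model assumptions: equal base frequencies (p = 1/4 per base),
# independent positions, both strands counted.

GENOME_SIZE = 3_200_000_000  # approximate haploid human genome, in bp

def expected_chance_sites(motif_length: int, genome_size: int = GENOME_SIZE) -> float:
    """Expected count of a single exact motif of the given length,
    counting both strands of a random genome."""
    p_match = 0.25 ** motif_length   # probability of a match at one position
    return 2 * genome_size * p_match # two strands

for k in (6, 8, 10):
    print(f"{k} bp site: ~{expected_chance_sites(k):,.0f} chance occurrences")
```

Even a 10 bp site is expected thousands of times by chance in a genome this size, which is why nonfunctional binding and spurious transcription are the default expectation, not a surprise.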

Different tissues will have different transcription factors. Thus, the low abundance junk RNAs must exhibit tissue specificity if our prediction is correct. Parrington and the ENCODE workers seem to think that the cell specificity of these low abundance transcripts is evidence of function. It isn't—it's exactly what you expect of spurious transcription. Parrington and the ENCODE leaders don't understand the scientific literature on transcription initiation and transcription factor binding sites.

It takes me an entire blog post to explain the flaws in just one paragraph of Parrington's book. The whole book is like this. The only thing it has going for it is that it's better than Nessa Carey's book [Nessa Carey doesn't understand junk DNA].


1. There are about 20,000 protein-encoding genes and an unknown number of genes specifying functional RNAs. I'm estimating that there are about 5,000 but some people think there are many more.

2. No definition is perfect. My point is that defining a gene as a DNA sequence that encodes a protein is something that should have been purged from textbooks decades ago. Any biochemist who ever thought seriously enough about the definition to bring it up in a scientific paper should be embarrassed to admit that they ever believed such a ridiculous definition.

Pennisi, E. (2012) "ENCODE Project Writes Eulogy for Junk DNA." Science 337: 1159-1161. [doi:10.1126/science.337.6099.1159]

Friday, March 21, 2014

Science still doesn't get it

The latest issue of Science contains an article by Yudhijit Bhattacharjee about Dan Graur and his critique of the ENCODE publicity disaster of September 2012. The focus of the article is on whether Dan's tone is appropriate when discussing science.

Let me remind you what Science published back on September 7, 2012. Elizabeth Pennisi announced that ENCODE had written the eulogy for junk DNA. She quoted one of the leading researchers ...

Saturday, February 11, 2017

What did ENCODE researchers say on Reddit?

ENCODE researchers answered a bunch of questions on Reddit a few days ago. I asked them to give their opinion on how much junk DNA is in our genome but they declined to answer that question. However, I think we can get some idea about the current thinking in the leading labs by looking at the questions they did choose to answer. I don't think the picture is very encouraging. It's been almost five years since the ENCODE publicity disaster of September 2012. You'd think the researchers might have learned a thing or two about junk DNA since that fiasco.

The question and answer session on Reddit was prompted by the award of a new grant to ENCODE. They just received 31.5 million dollars to continue their search for functional regions in the human genome. You might have guessed that Dan Graur would have a few words to say about giving ENCODE even more money [Proof that 100% of the Human Genome is Functional & that It Was Created by a Very Intelligent Designer @ENCODE_NIH].

Thursday, August 07, 2014

The Function Wars: Part IV

The world is not inhabited exclusively by fools and when a subject arouses intense interest and debate, as this one has, something other than semantics is usually at stake.
Stephen Jay Gould (1982)
This is my fourth post on the function wars.

The first post in this series covered the various definitions of "function" [Quibbling about the meaning of the word "function"]. In the second post I tried to create a working definition of "function" and I discussed whether active transposons count as functional regions of the genome or junk [The Function Wars: Part II]. I claim that junk DNA is nonfunctional DNA that can be deleted from the genome of an organism without affecting its survival, or the survival of its descendants.

In the third post I discussed a paper by Rands et al. (2014) presenting evidence that about 8% of the human genome is conserved [The Function Wars: Part III]. This is important since many workers equate sequence conservation with function. It suggests that only 8% of our genome is functional and the rest is junk. The paper is confusing and I'm still not sure what they did in spite of the fact that the lead author (Chris Rands) helped us out in the comments. I don't know what level of sequence similarity they counted as "constrained." (Was it something like 35% identity over 100 bp?)

My position is that there's no simple definition of function but sequence conservation is a good proxy. It's theoretically possible to have selection for functional bulk DNA that doesn't depend on sequence but, so far, there are no believable hypotheses that make the case. It is wrong to arbitrarily DEFINE function in terms of selection (for sequence) because that rules out all bulk DNA hypotheses by fiat and that's not a good way to do science.

So, if the Rands et al. results hold up, it looks like more than 90% of our genome is junk.

Let's see how a typical science writer deals with these issues. The article I'm selecting is from Nature. It was published online yesterday (Aug. 6, 2014) (Woolston, 2014). The author is Chris Woolston, a freelance writer with a biology background. Keep in mind that it was Nature that started the modern function wars by falling hook-line-and-sinker for the ENCODE publicity hype. As far as I know, the senior editors have not admitted that they, and their reviewers, were duped.

Thursday, September 20, 2012

Are All IDiots Irony Deficient?

As I'm sure you can imagine, the Intelligent Design Creationists are delighted with the ENCODE publicity. This is a case where some expert scientists support one of their pet beliefs; namely, that there's no such thing as junk DNA. The IDiots tend not to talk about other expert evolutionary biologists who disagree with them—those experts are biased Darwinists or are part of a vast conspiracy to mislead the public.

You might think that distinguishing between these two types of expert scientists would be a real challenge and you would be right. Let's watch how David Klinghoffer manoeuvres through this logical minefield at: ENCODE Results Separate Science Advocates from Propagandists. He begins with ....
"I must say," observes an email correspondent of ours, who is also a biologist, "I'm getting a kick out of watching evolutionary biologists attack molecular biologists for 'hyping' the ENCODE results."

True, and equally enjoyable -- in the sense of confirming something you strongly suspected already -- is seeing the way the ENCODE news has drawn a bright line between voices in the science world that care about science and those that are more focussed on the politics of science, even as they profess otherwise.

Monday, September 10, 2012

Science Writes Eulogy for Junk DNA

Elizabeth Pennisi is a science writer for Science, the premiere American science journal. She's been writing about "dark matter" for years focusing on how little we know about most of the human genome and ignoring all of the data that says it's mostly junk [see SCIENCE Questions: Why Do Humans Have So Few Genes?].

It doesn't take much imagination to guess what Elizabeth Pennisi was going to write when she heard about the new ENCODE data. Yep, you guessed it. She says that the ENCODE Project Writes Eulogy for Junk DNA.

Let's look at the opening paragraph in her "eulogy."
When researchers first sequenced the human genome, they were astonished by how few traditional genes encoding proteins were scattered along those 3 billion DNA bases. Instead of the expected 100,000 or more genes, the initial analyses found about 35,000 and that number has since been whittled down to about 21,000. In between were megabases of “junk,” or so it seemed.

Thursday, September 06, 2012

What in the World Is Michael Eisen Talking About?

I've been trying to keep up with the ENCODE PR fiasco so I immediately clicked on a link to Michael Eisen's blog with the provocative title it is NOT junk. The article is: This 100,000 word post on the ENCODE media bonanza will cure cancer.

Michael Eisen is an evolutionary biologist at the University of California at Berkeley. He's best known, to me, as the brother of Jonathan Eisen.

Michael, like me and hundreds of other scientists, is upset by the ENCODE press releases. One of them is: Fast forward for biomedical research: ENCODE scraps the junk.
The hundreds of researchers working on the ENCODE project have revealed that much of what has been called 'junk DNA' in the human genome is actually a massive control panel with millions of switches regulating the activity of our genes. Without these switches, genes would not work – and mutations in these regions might lead to human disease. The new information delivered by ENCODE is so comprehensive and complex that it has given rise to a new publishing model in which electronic documents and datasets are interconnected.
Here's the interesting thing. Many of us are upset about the press releases and the PR because we don't think the ENCODE data disproves junk DNA. Michael Eisen's perspective is entirely different. He's upset because, according to him, junk DNA was discredited years ago.
The problems start before the first line ends. As the authors undoubtedly know, nobody actually thinks that non-coding DNA is ‘junk’ any more. It’s an idea that pretty much only appears in the popular press, and then only when someone announces that they have debunked it. Which is fairly often. And has been for at least the past decade. So it is more than just intellectually lazy to start the story of ENCODE this way. It is dishonest – nobody can credibly claim this to be a finding of ENCODE. Indeed it was a clear sense of the importance of non-coding DNA that led to the ENCODE project in the first place. And yet, each of the dozens of news stories I read on this topic parroted this absurd talking point – falsely crediting ENCODE with overturning an idea that didn’t need to be overturned.
Eisen is wrong: junk DNA is alive and well. In fact, almost 90% of our genome is junk.

This is what makes science so much fun.


Sunday, September 09, 2012

The Random Genome Project

Sean Eddy is an old—well not too old—talk.origins fan. (Hi Sean!)

Because he's had all that training in how to think correctly, he gets the difference between junk DNA and functional DNA. Read his post at: ENCODE says what? (C'est what?).

Think about your answer to the Random Genome Project thought experiment.
So a-ha, there’s the real question. The experiment that I’d like to see is the Random Genome Project. Synthesize a hundred million base chromosome of entirely random DNA, and do an ENCODE project on that DNA. Place your bets: will it be transcribed? bound by DNA-binding proteins? chromatin marked?

Of course it will.

The Random Genome Project is the null hypothesis, an essential piece of understanding that would be lovely to have before we all fight about the interpretation of ENCODE data on genomes. For random DNA (not transposon-derived DNA, not coding, not regulatory), what’s our null expectation for all these “functional” ENCODE features, by chance alone, in random DNA?

(Hat tip to The Finch and Pea blog, a great blog that I hadn’t seen before the last few days, where you’ll find essentially the same idea.)
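Eddy's null hypothesis is easy to sketch in a few lines of code. This toy version is mine, not his: it generates random DNA and counts exact matches to an arbitrary 8 bp motif (the motif itself is just an illustrative choice). The point is that pure noise yields "sites" at close to the predicted rate, so finding them proves nothing about function.

```python
import random

random.seed(42)  # reproducible toy example

def random_dna(n: int) -> str:
    """Generate n bases of uniformly random DNA."""
    return "".join(random.choice("ACGT") for _ in range(n))

def count_sites(seq: str, motif: str) -> int:
    """Count exact (possibly overlapping) occurrences of motif in seq."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)

chrom = random_dna(1_000_000)    # 1 Mb toy "random chromosome"
motif = "TGACGTCA"               # an arbitrary 8 bp binding-site motif
hits = count_sites(chrom, motif)
expected = (len(chrom) - len(motif) + 1) * 0.25 ** len(motif)

print(f"observed {hits} sites; expected ~{expected:.1f} by chance alone")
```

Scale this up to a 3.2 Gb genome and thousands of distinct motifs and you get the millions of "biochemically active" elements that ENCODE counted as functional.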


Monday, February 05, 2018

ENCODE's false claims about the number of regulatory sites per gene

Some beating of dead horses may be ethical, where here and there they display unexpected twitches that look like life.

Zuckerkandl and Pauling (1965)

I realize that most of you are tired of seeing criticisms of ENCODE but it's important to realize that most scientists fell hook-line-and-sinker for the ENCODE publicity campaign and they still don't know that most of the claims were ridiculous.

I was reminded of this when I re-read Brendan Maher's summary of the ENCODE results that were published in Nature on Sept. 6, 2012 (Maher, 2012). Maher's article appeared in the front section of the ENCODE issue.1 With respect to regulatory sequences he said ...
The consortium has assigned some sort of function to roughly 80% of the genome, including more than 70,000 ‘promoter’ regions — the sites, just upstream of genes, where proteins bind to control gene expression — and nearly 400,000 ‘enhancer’ regions that regulate expression of distant genes ... But the job is far from done, says [Ewan] Birney, a computational biologist at the European Molecular Biology Laboratory’s European Bioinformatics Institute in Hinxton, UK, who coordinated the data analysis for ENCODE. He says that some of the mapping efforts are about halfway to completion, and that deeper characterization of everything the genome is doing is probably only 10% finished.

Tuesday, August 25, 2015

The apophenia of ENCODE or Pangloss looks at the human genome

This is a paper in French by Casane et al. (2015). Most of you won't be able to read it but the English abstract gives you the gist of the argument. I had to look up "apophenia": "Apophenia has come to imply a universal human tendency to seek patterns in random information, such as gambling."
In September 2012, a batch of more than 30 articles presenting the results of the ENCODE (Encyclopaedia of DNA Elements) project was released. Many of these articles appeared in Nature and Science, the two most prestigious interdisciplinary scientific journals. Since that time, hundreds of other articles dedicated to the further analyses of the Encode data have been published. The time of hundreds of scientists and hundreds of millions of dollars were not invested in vain since this project had led to an apparent paradigm shift: contrary to the classical view, 80% of the human genome is not junk DNA, but is functional. This hypothesis has been criticized by evolutionary biologists, sometimes eagerly, and detailed refutations have been published in specialized journals with impact factors far below those that published the main contribution of the Encode project to our understanding of genome architecture. In 2014, the Encode consortium released a new batch of articles that neither suggested that 80% of the genome is functional nor commented on the disappearance of their 2012 scientific breakthrough. Unfortunately, by that time many biologists had accepted the idea that 80% of the genome is functional, or at least, that this idea is a valid alternative to the long held evolutionary genetic view that it is not. In order to understand the dynamics of the genome, it is necessary to re-examine the basics of evolutionary genetics because, not only are they well established, they also will allow us to avoid the pitfall of a panglossian interpretation of Encode. Actually, the architecture of the genome and its dynamics are the product of trade-offs between various evolutionary forces, and many structural features are not related to functional properties. In other words, evolution does not produce the best of all worlds, not even the best of all possible worlds, but only one possible world.


Casane, D., Fumey, J., and Laurenti, P. (2015) L’apophénie d’ENCODE ou Pangloss examine le génome humain. Med. Sci. (Paris) 31: 680-686. [doi: 10.1051/medsci/20153106023]

Thursday, December 31, 2020

On the importance of controls

When doing an experiment, it's important to keep the number of variables to a minimum and it's important to have scientific controls. There are two types of controls. A negative control covers the possibility that you will get a signal by chance; for example, if you are testing an enzyme to see whether it degrades sugar then the negative control will be a tube with no enzyme. Some of the sugar may degrade spontaneously and you need to know this. A positive control is when you deliberately add something that you know will give a positive result; for example, if you are doing a test to see if your sample contains protein then you want to add an extra sample that contains a known amount of protein to make sure all your reagents are working.

Lots of controls are more complicated than the examples I gave but the principle is important. It's true that some experiments don't appear to need the appropriate controls but that may be an illusion. The controls might still be necessary in order to properly interpret the results but they're not done because they are very difficult. This is often true of genomics experiments.

Wednesday, June 25, 2014

The Function Wars: Part I

This is Part I of the "Function Wars" posts. The second one is on The ENCODE legacy.1

Quibbling about the meaning of the word "function"

The world is not inhabited exclusively by fools and when a subject arouses intense interest and debate, as this one has, something other than semantics is usually at stake.
Stephen Jay Gould (1982)
The ENCODE Consortium tried to redefine the word “function” to include any biological activity that they could detect using their genome-wide assays. This was not helpful since it included a huge number of sites and sequences that result from spurious (nonfunctional) binding of transcription factors or accidental transcription of random DNA sequences to make junk RNA [see What did the ENCODE Consortium say in 2012?].

I believe that this strange way of redefining biological function was a deliberate attempt to discredit junk DNA. It was quite successful since much of the popular press interpreted the ENCODE results as refuting or disproving junk DNA. I believe that the leaders of the ENCODE Consortium knew what they were doing when they decided to hype their results by announcing that 80% of the human genome is functional [see The Story of You: Encode and the human genome – video, Science Writes Eulogy for Junk DNA].

The ENCODE Project, today, announces that most of what was previously considered as 'junk DNA' in the human genome is actually functional. The ENCODE Project has found that 80 per cent of the human genome sequence is linked to biological function.

[Google Earth of Biomedical Research]

Tuesday, March 13, 2018

Making Sense of Genes by Kostas Kampourakis

Kostas Kampourakis is a specialist in science education at the University of Geneva in Switzerland. Most of his book is an argument against genetic determinism in the style of Richard Lewontin. You should read this book if you are interested in that argument. The best way to describe the main thesis is to quote from the last chapter.

Here is the take-home message of this book: Genes were initially conceived as immaterial factors with heuristic values for research, but along the way they acquired a parallel identity as DNA segments. The two identities never converged completely, and therefore the best we can do so far is to think of genes as DNA segments that encode functional products. There are neither 'genes for' characters nor 'genes for' diseases. Genes do nothing on their own, but are important resources for our self-regulated organism. If we insist in asking what genes do, we can accept that they are implicated in the development of characters and disease, and that they account for variation in characters in particular populations. Beyond that, we should remember that genes are part of an interactive genome that we have just begun to understand, the study of which has various limitations. Genes are not our essences, they do not determine who we are, and they are not the explanation of who we are and what we do. Therefore we are not the prisoners of any genetic fate. This is what the present book has aimed to explain.

Friday, January 04, 2013

Intelligent Design Creationists Choose ENCODE Results as the #1 Evolution Story of 2012

The folks over at Evolution News & Views (sic) have selected the ENCODE papers as the Number 1 evolution-related story of 2012. Naturally, they fell for the hype of the ENCODE/Nature publicity campaign as you can see from the blog title: Our Top 10 Evolution-Related Stories: #1, ENCODE Project Buries "Junk DNA".

Most of the article is just the reposting of an article by Casey Luskin but some anonymous editor has added ...
Editor's note: For the No. 1 slot among evolution-related news stories of 2012, this one was an easy pick. The publication of the ENCODE project results detonated what had been considered among the sturdiest defenses that Darwinian evolutionary theory could still fall back upon: "Junk DNA." Casey Luskin's initial reporting is featured below. See also our response to the ensuing controversy over ENCODE ("Why the Case for Junk DNA 2.0 Still Fails").
Normally I would make fun of the creationists for misunderstanding the real scientific results in the papers that were published last September but, in this case, there are lots of real scientists who fell into the same trap.

Even Science magazine selected the ENCODE results as a top-ten breakthrough and noted that 80% of the human genome now has a function [Science Magazine Chooses ENCODE Results as One of the Top Ten Breakthroughs in 2012]. Oh well, I guess I'll just have to be content to point out that many scientists are as stupid as many Intelligent Design Creationists!

I can still mock the creationists for claiming that "Darwinian evolutionary theory" supports junk DNA.


Tuesday, December 04, 2012

Sean Eddy on Junk DNA and ENCODE

Sean Eddy is a bioinformatics expert who runs a lab at the Howard Hughes Medical Institute (HHMI) Janelia Farm Research Campus in Virginia (USA).1 Sean was one of the many scientists who spoke out against the ENCODE misinterpretation of their own results [ENCODE says what?].

Most people now know that ENCODE did not disprove junk DNA (with the possible exception of creationists and a few kooks).

Sean has written a wonderful article for Current Biology where he explains in simple terms why there is abundant evidence for junk (i.e. nonfunctional) DNA [The C-value paradox, junk DNA and ENCODE] [preprint].

Here's a quotation from the article to pique your interest.
Recently, the ENCODE project has concluded that 80% of the human genome is reproducibly transcribed, bound to proteins, or has its chromatin specifically modified. In widespread publicity around the project, some ENCODE leaders claimed that this biochemical activity disproves junk DNA. If there is an alternative hypothesis, it must provide an alternative explanation for the data: for the C-value paradox, for mutational load, and for how a large fraction of eukaryotic genomes is composed of neutrally drifting transposon-derived sequence. ENCODE hasn’t done this, and most of ENCODE’s data don’t bear directly on the question. Transposon‑derived sequence is generally expected to be biochemically active by ENCODE’s definitions — lots of transposon sequences are inserted into transcribed genic regions, mobile transposons are transcribed and regulated, and genomic suppression of transposon activity requires DNA‑binding and chromatin modification.

The question that the ‘junk DNA’ concept addresses is not whether these sequences are biochemically ‘active’, but whether they’re there primarily because they’re useful for the organism.


1. More importantly, he's an alumnus of Talk.origins.

Monday, August 05, 2019

Religion vs science (junk DNA): a blast from the past

I was checking out the science books in our local bookstore the other day and I came across Evolution 2.0 by Perry Marshall. It was published in 2015 but I don't recall seeing it before.

The author is an engineer (The Salem Conjecture) who's a big fan of Intelligent Design. The book is an attempt to prove that evolution is a fraud.

I checked to see if junk DNA was mentioned and came across the following passages on pages 273-275. It's interesting to read them in light of what's happened in the past four years. I think that the view represented in this book is still the standard view in the ID community in spite of the fact that it is factually incorrect and scientifically indefensible.

Sunday, September 09, 2012

Ed Yong Updates His Post on the ENCODE Papers

For decades we've known that less than 2% of the human genome consists of exons and that protein-encoding genes represent more than 20% of the genome. (Introns account for the difference between exons and genes.) [What's in Your Genome?]. There are about 20,500 protein-encoding genes in our genome and about 4,000 genes that encode functional RNAs for a total of about 25,000 genes [Humans Have Only 20,500 Protein-Encoding Genes]. That's a little less than the number predicted by knowledgeable scientists over four decades ago [False History and the Number of Genes]. The definition of "gene" is somewhat open-ended but, at the very least, a gene has to have a function [Must a Gene Have a Function?].

We've known about all kinds of noncoding DNA that's functional, including origins of replication, centromeres, genes for functional RNAs, telomeres, and regulatory DNA. Together these functional parts of the genome make up almost 10% of the total. (Most of the DNA giving rise to introns is junk in the sense that it is not serving any function.) The idea that all noncoding DNA is junk is a myth propagated by scientists (and journalists) who don't know their history.

We've known about the genetic load argument since 1968 and we've known about the C-Value "Paradox" and its consequences since the early 1970s. We've known about pseudogenes and we've known that almost 50% of our genome is littered with dead transposons and bits of transposons. We've known that about 3% of our genome consists of highly repetitive DNA that is not transcribed or expressed in any way. Most of this DNA is not functional and a lot of it is not included in the sequenced human genome [How Much of Our Genome Is Sequenced?]. All of this evidence indicates that most of our genome is junk. This conclusion is consistent with what we know about evolution and it's consistent with what we know about genome sizes and the C-Value "Paradox." It also helps us understand why there's no correlation between genome size and complexity.
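The genetic load argument mentioned above can be sketched as simple arithmetic. This is a minimal, hypothetical back-of-envelope calculation, not from the original post: it assumes the commonly cited figure of roughly 100 new mutations per human zygote per generation, and it simply scales that number by the fraction of the genome assumed to be functional.

```python
# Back-of-envelope genetic load illustration (assumed figure: ~100 new
# mutations per zygote per generation; the functional fractions shown
# are hypothetical scenarios, not measured values).
NEW_MUTATIONS_PER_GENERATION = 100

def mutations_hitting_functional_dna(functional_fraction):
    """Expected new mutations per generation that land in functional DNA."""
    return NEW_MUTATIONS_PER_GENERATION * functional_fraction

for fraction in (0.10, 0.80):
    hits = mutations_hitting_functional_dna(fraction)
    print(f"If {fraction:.0%} of the genome is functional, "
          f"~{hits:.0f} new mutations per generation hit functional DNA")
```

The point of the sketch: if 80% of the genome were functional (as the ENCODE publicity implied), each generation would carry far more mutations in functional DNA than a population could plausibly purge, whereas a genome that is ~90% junk keeps the deleterious load manageable.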

Friday, December 13, 2019

The "standard" view of junk DNA is completely wrong

I was browsing the table of contents of the latest issue of Cell and I came across this ....
For decades, the miniscule protein-coding portion of the genome was the primary focus of medical research. The sequencing of the human genome showed that only ∼2% of our genes ultimately code for proteins, and many in the scientific community believed that the remaining 98% was simply non-functional “junk” (Mattick and Makunin, 2006; Slack, 2006). However, the ENCODE project revealed that the non-protein coding portion of the genome is copied into thousands of RNA molecules (Djebali et al., 2012; Gerstein et al., 2012) that not only regulate fundamental biological processes such as growth, development, and organ function, but also appear to play a critical role in the whole spectrum of human disease, notably cancer (for recent reviews, see Adams et al., 2017; Deveson et al., 2017; Rupaimoole and Slack, 2017).

Slack, F.J. and Chinnaiyan, A.M. (2019) The Role of Non-coding RNAs in Oncology. Cell 179:1033-1055 [doi: 10.1016/j.cell.2019.10.017]
Cell is a high-impact, refereed journal so we can safely assume that this paper was reviewed by reputable scientists. This means that the view expressed in the paragraph above did not raise any alarm bells when the paper was reviewed. The authors clearly believe that what they are saying is true and so do many other reputable scientists. This seems to be the "standard" view of junk DNA among scientists who do not understand the facts or the debate surrounding junk DNA and pervasive transcription.

Here are some of the obvious errors in the statement.
  1. The sequencing of the human genome did NOT show that only ~2% of our genome consisted of coding region. That fact was known almost 50 years ago and the human genome sequence merely confirmed it.
  2. No knowledgeable scientist ever thought that the remaining 98% of the genome was junk—not in 1970 and not in any of the past fifty years.
  3. The ENCODE project revealed that much of our genome is transcribed at some time or another but it is almost certainly true that the vast majority of these low-abundance, non-conserved, transcripts are junk RNA produced by accidental transcription.
  4. The existence of noncoding RNAs such as ribosomal RNA and tRNA was known in the 1960s, long before ENCODE. The existence of snoRNAs, snRNAs, regulatory RNAs, and various catalytic RNAs was known in the 1980s, long before ENCODE. Other RNAs such as miRNAs, piRNAs, and siRNAs were well known in the 1990s, long before ENCODE.
How did this false view of our genome become so widespread? It's partially because of the now highly discredited ENCODE publicity campaign orchestrated by Nature and Science but that doesn't explain everything. The truth is out there in peer-reviewed scientific publications but scientists aren't reading those papers. They don't even realize that their standard view has been seriously challenged. Why?