Sandwalk: The quality of the modern scientific literature leaves much to be desired

Wednesday, October 21, 2015

The quality of the modern scientific literature leaves much to be desired

Lately I've been reading a lot of papers on genomes and I've discovered some really exceptional papers that discuss the existing scientific literature and put their studies in proper context. Unfortunately, these are the exceptions, not the rule.

I've discovered many more authors who seem to be ignorant of the scientific literature and far too willing to rely on the opinions of others instead of investigating for themselves. Many of these authors seem to be completely unaware of controversy and debate in the fields they are writing about. They act, and write, as if there were only one point of view worth considering: theirs.

How does this happen? It seems to me that it can only happen if they find themselves in an environment where skepticism and critical thinking are suppressed. Otherwise, how do you explain the way they write their papers? Are there no colleagues, post-docs, or graduate students who looked at the manuscript and pointed out the problems? Are there no referees who raised questions?

Let's look at a paper on functional elements in the human genome (Milligan and Lipovich, 2015). It wasn't published in a front-line journal but that shouldn't matter for the points I'd like to make. This is a review article so special rules apply. As a scientist, you are obliged to represent the field fairly and honestly when writing a review. Here's the abstract ...

In the more than one decade since the completion of the Human Genome Project, the prevalence of non-protein-coding functional elements in the human genome has emerged as a key revelation in post-genomic biology. Highlighted by the ENCODE (Encyclopedia of DNA Elements) and FANTOM (Functional Annotation of Mammals) consortia, these elements include tens of thousands of pseudogenes, as well as comparably numerous long non-coding RNA (lncRNA) genes. Pseudogene transcription and function remain insufficiently understood. However, the field is of great importance for human disease due to the high sequence similarity between pseudogenes and their parental protein-coding genes, which generates the potential for sequence-specific regulation. Recent case studies have established essential and coordinated roles of both pseudogenes and lncRNAs in development and disease in metazoan systems, including functional impacts of lncRNA transcription at pseudogene loci on the regulation of the pseudogenes’ parental genes. This review synthesizes the nascent evidence for regulatory modalities jointly exerted by lncRNAs and pseudogenes in human disease, and for recent evolutionary origins of these systems.

The authors are, of course, entitled to their opinion but they are not entitled to state it as if it were a fact. I do not believe that the prevalence of non-coding functional elements is a key "revelation" of the past 15 years.

For one thing, those elements that truly are functional were known BEFORE the human genome was sequenced. For another, it's not true, in my opinion, that there are huge amounts of functional DNA in the human genome. Any scientist who has kept up with the literature will know that the conclusions of the ENCODE Consortium and FANTOM are not universally accepted so they should not be quoted in an abstract as if they were necessarily true.

It would be okay to say something like this, "We believe that ENCODE and FANTOM have demonstrated that much of the human genome is functional but we will review and report contrary evidence and opinions."

The authors say that "tens of thousands of pseudogenes" are functional but there's no evidence at all that this is true. They also say that a similar number of lncRNA elements are functional but, again, there is no evidence that this is true. There may be lots of people who like to think that tens of thousands of DNA elements are functional (i.e. genes) because they produce functional RNAs but wishing is not evidence.

It would be okay to say, "After an extensive review of the literature we conclude that tens of thousands of pseudogenes, and a similar number of lncRNAs, are functional although we recognize that most scientists will disagree with our opinion."

There's a more fundamental problem with this abstract and it has to do with the connections between genome activities and disease. The implicit assumption in this paper, and in many other papers, is that the locus of disease-causing mutations pinpoints functional regions of the genome. This is not correct. You could easily have a mutation that enhances transcription in a junk DNA region and the aberrant transcription interferes with the expression of a nearby gene. An example might be a spurious mutation that leads to transcription of an adjacent pseudogene from the opposite strand and the resulting antisense RNA blocks translation of the mRNA from the active gene. That does not mean that the junk DNA and the pseudogene now have a function.

You can also have a mutation in the junk DNA part of a large intron creating a new splice site leading to splicing errors that shut down proper gene expression. This does not mean that the site of the mutation has a function and can no longer be considered junk. We need to recognize that many disease-causing mutations might occur in junk DNA. These go by the unfortunate name of "gain-of-function" mutations.

The Milligan & Lipovich paper begins with ....

Redefining the Human Gene Count

Classical definitions of genes focus on heritable sequences of nucleic acids which can encode a protein (White et al., 1994).

You can guess where this is going. The authors are going to make the case that new data has forced us to recognize that there are genes for functional RNAs that don't encode proteins. This is a standard approach for a certain group of scientists who want to defend ENCODE and the functionality of most of our genome.

The set-up requires you to believe that during the 1990s everyone thought that the only kind of genes were those that encoded proteins. This is not true, but it is a misrepresentation of the truth that seems to be widely believed. I can assure you that knowledgeable scientists have known about genes for ribosomal RNAs and tRNAs for half a century and we've known about a host of other genes for functional RNAs for thirty years.

It may be the case that Michael Milligan and Leonard Lipovich were ignorant of non-protein-coding genes until very recently but it's not fair to imply that this misconception was shared by most knowledgeable scientists.

The reference (White et al., 1994) was not something I recognized so I tried to look it up. After a bit of searching I realized that the order of authors was incorrect and the real reference is Fields et al. (1994). It's a News & Views article in Nature Genetics entitled "How many genes in the human genome?" The authors are from Craig Venter's ~~company~~ private research institute (The Institute for Genomic Research, TIGR) and they include Craig Venter. At the time, TIGR was trying to determine the sequences of human genes.

Fields et al. know that defining the word "gene" is important so they say ...

Counting genes requires being clear about what counts as a gene. "Gene" is a notoriously slippery concept, and differing notions about what it means to identify one can lead to heated disagreements. Some define a gene physically as a region of DNA sequence containing a transcription unit and the associated regulatory sequences.

They refer to genes for small regulatory RNAs but decide to focus on transcription units that can be translated into proteins in the rest of their discussion.

It's not clear to me why Milligan & Lipovich use this reference to bolster their claim that "classical" definitions of genes focus on genes that encode proteins unless they mean that Fields et al. were aware of the proper definition of gene but decided to restrict their count to protein-coding genes. (See What Is a Gene? for a more thorough discussion.)

Milligan & Lipovich continue the Introduction with ...

The question of how many genes the human genome contains has been an evolving point of contention since before the Human Genome Project. In 1994, the estimated total human protein-coding gene count was 64,000–71,000 genes (White et al., 1994). The higher gene estimate was based on partial genome sequencing, GC content, and genome size. The lower bound of 64,000 took into account expressed sequence tags (ESTs) and CpG islands as additional prediction factors. In 2000, a new count of actively transcribed genes was estimated at 120,000 using the TIGR Gene Index, based on ESTs, with the results from the Chromosome 22 Sequencing Consortium (Liang et al., 2000). 1 year later, Celera arrived at only 26,500–38,600 protein-coding genes using their completed human genome and comparative mouse genomics (Venter et al., 2001). The Human Genome Project, which used tiling-path sequencing as opposed to Celera’s shotgun sequencing, converged on a similar estimate (Lander et al., 2001).

Stories like this have become standard fare in many papers these days. It's an example of the fallacy that if you repeat a lie often enough it becomes accepted as truth. Here's the truth ...

The Milligan & Lipovich paper is also a clear example of laziness. The authors are satisfied with repeating a myth instead of doing their own research into what knowledgeable scientists really thought about the number of genes in the human genome.

At least in this case the authors have read an "ancient" paper from 1994. It's the Fields et al. paper that I talked about above only they refer to it as White et al. (1994). It's actually a pretty good paper on the number of genes. They discuss estimates ranging from 14,000 to 100,000 recognizing that the problem was difficult. Unfortunately they don't discuss any of the genetic load predictions.

Fields et al. (1994) figure there are between 60,000 and 70,000 protein-coding genes in the human genome. But just because some people thought that there were so many genes doesn't mean that this was the value universally accepted by all knowledgeable scientists.

By the time the complete draft human genome sequence was published we already knew the sequences of chromosomes 21 and 22, and the gene frequency in these chromosomes gave rise to predictions of 40,000 to 45,000 genes in the whole genome (see Aparicio, 2000). These were likely to be overestimates since both of these small chromosomes are rich in genes compared to the rest of the genome. (At the time we didn't know that the algorithms for counting genes returned many false positives.) That means that the gene count was approaching the numbers estimated earlier (about 30,000, if you only count knowledgeable scientists).

I find it interesting that Milligan & Lipovich take a different view of the history, saying that the estimates from chromosomes 21 and 22 predicted 120,000 genes. Their reference is Liang et al. (2000). It's true that Liang et al. worked at TIGR and it's true that their estimate was 120,000. However, that paper is in the same issue of Nature Genetics as the Aparicio (2000) paper I just quoted and two papers by Ewing and Green (2000) and Roest Crollius et al. (2000). The Ewing and Green estimate is 35,000 genes. The Roest Crollius et al. estimate is 28,000-34,000 genes. The papers were part of an issue on "The Nature of the Number."

So even if your version of ancient history only extends back to 1994, it's clear that by 2000 (one year before publication of the draft human genome sequence) most knowledgeable scientists—even those who were ignorant of the real ancient history from the 1960s—were thinking that the human genome had about 30,000 genes.

You may be wondering, as I did, why Milligan & Lipovich want to make a point about historical estimates of gene number when we already know the correct answer. I'm not sure why they think it's important. Clearly it's not important enough for them to have done a critical job of describing that history. Based on what I've seen in other papers, this sort of introduction seems designed to show you that there is a lot of "missing information" in the genome since scientists were expecting many more genes.

These are estimates of protein-coding genes. That's not because knowledgeable scientists didn't know about any other genes; it's because recognizing genes for functional RNAs is much more difficult. Samuel Aparicio explained it very nicely 15 years ago (Aparicio, 2000) ...

Although the tendency (especially in a pay-per-sequence access mode) is to assume that any transcript represents a gene, classical genetics demands some evidence of associated function. Crucially, what is not yet established (but is implied to be relatively abundant by these studies) is the extent of biological "noise" in the transcriptome of any given cell. In other words, what fraction of transcripts which can be isolated have any meaningful function? What fraction might be mere by-products of spurious transcription, spuriously fired off, perhaps on the antisense strand from promoters or CpG islands associated with protein coding genes (as seems to be the case with a number of imprinted genes)?

Lots and lots of scientists have expressed this cautionary view but no matter how many times it's published there are many more scientists who ignore the warning and continue to ignore it to this day. It's not a question of whether, in your opinion, the transcripts are functional in spite of the potential problems, it's that too many scientists won't even recognize that there's a problem.

Let's see the next paragraph in the Milligan & Lipovich paper.

Following the sequencing of the human genome, focus has shifted toward understanding gene function. In 2005, the FANTOM (Functional Annotation of Mammals) Consortium determined that the mouse genome harbored more non-coding genes than coding genes (Carninci and Hayashizaki, 2007). In a parallel project to FANTOM, the ENCODE (Encyclopedia of DNA Elements) Consortium began exhaustively surveyed the epigenetics and regulation of the whole genome (Birney et al., 2007; Consortium ENCODE Project, 2012). ENCODE’s continuing effort to recount human genes (GENCODE) using the study of genetic landmarks indicative of transcription and next generation sequencing has allowed them to arrive at a current total of just under 58,000 genes as of 2013 (gencodegenes.org). Of these 58,000 genes ENCODE only defines approximately 20,000 genes as coding, with almost all of the other genes being classified as pseudogenes and non-coding RNA (ncRNA). Early studies of the mouse transcriptome by the FANTOM Consortium first motivated the redefinition of a gene into a transcriptional unit as a consequence of large numbers of lncRNA genes discovered (Carninci et al., 2005).

Things are beginning to fall into place in this paper. The authors want you to believe that historical gene number estimates were much higher than the actual number of genes observed when the human genome sequence was published. That's because scientists thought that the only kind of genes were those that encode proteins, according to the myth. However, recent discoveries by ENCODE and FANTOM show that those scientists were wrong and there are actually genes for noncoding RNAs. Furthermore, those RNA genes outnumber the protein-coding genes by a large margin (38,000 to 20,000).

The caution expressed by Aparicio, and many others, is ignored. The rest of the paper consists of reviews of lncRNA functions and pseudogene functions. With respect to lncRNAs, there's no discussion of whether these lncRNAs represent "noise" and no critical review of the case for function. Even lack of conservation doesn't faze Milligan & Lipovich because these nonconserved genes for lncRNAs are still exaptive; they can easily become important functioning genes. As reservoirs for future change, they are "not disposable even when adaptation doesn't govern their existence."

Contrast this biased review with a review of lncRNAs published by my colleagues Alex Palazzo and Eliza Lee in the same journal a month earlier (Palazzo and Lee, 2015). They review the literature with a critical eye and conclude that ...

The genomes of large multicellular eukaryotes are mostly comprised of non-protein coding DNA. Although there has been much agreement that a small fraction of these genomes has important biological functions, there has been much debate as to whether the rest contributes to development and/or homeostasis. Much of the speculation has centered on the genomic regions that are transcribed into RNA at some low level. Unfortunately these RNAs have been arbitrarily assigned various names, such as “intergenic RNA,” “long non-coding RNAs” etc., which have led to some confusion in the field. Many researchers believe that these transcripts represent a vast, unchartered world of functional non-coding RNAs (ncRNAs), simply because they exist. However, there are reasons to question this Panglossian view because it ignores our current understanding of how evolution shapes eukaryotic genomes and how the gene expression machinery works in eukaryotic cells. Although there are undoubtedly many more functional ncRNAs yet to be discovered and characterized, it is also likely that many of these transcripts are simply junk. Here, we discuss how to determine whether any given ncRNA has a function. Importantly, we advocate that in the absence of any such data, the appropriate null hypothesis is that the RNA in question is junk.

I know for a fact that the Palazzo and Lee manuscript was reviewed by a number of knowledgeable and skeptical scientists before it was sent off. They even sent it to an old curmudgeon who criticizes everything.¹

The question is, why didn't the Milligan & Lipovich paper get the same scrutiny before they sent it off to the journal?

The other part of the Milligan & Lipovich paper discusses possible functions of pseudogenes. Again, there's a remarkable lack of critical thinking. The only case presented is the case for function. There's no attempt whatsoever to critically analyze and defend their claim in the abstract and introduction that "... the prevalence of non-protein-coding functional elements in the human genome has emerged as a key revelation in post-genomic biology." It's a classic case of confirmation bias and this isn't supposed to happen in the scientific literature, especially in reviews.

1. They didn't need to change any of their main points in response to reviewers because they already knew how to read and interpret the literature correctly.

Aparicio, S.A.J.R. (2000) How to count… human genes. Nature Genetics, 25:129-130. [doi:10.1038/75949]

Ewing, B., and Green, P. (2000) Analysis of expressed sequence tags indicates 35,000 human genes. Nature Genetics, 25:232-234. [doi:10.1038/76115]

Fields, C., Adams, M.D., White, O., and Venter, J.C. (1994) How many genes in the human genome? Nature Genetics, 7:345-346. [PDF]

Liang, F., Holt, I., Pertea, G., Karamycheva, S., Salzberg, S.L., and Quackenbush, J. (2000) Gene Index analysis of the human genome estimates approximately 120,000 genes. Nature Genetics, 25:239-240. [doi:10.1038/76126]

Milligan, M.J., and Lipovich, L. (2014) Pseudogene-derived lncRNAs: emerging regulators of gene expression. Frontiers in Genetics, 5: [doi: 10.3389/fgene.2014.00476]

Palazzo, A.F., and Lee, E.S. (2015) Non-coding RNA: what is functional and what is junk? Frontiers in Genetics, 6:2 [doi: 10.3389/fgene.2015.00002]

Pertea, M., and Salzberg, S. (2010) Between a chicken and a grape: estimating the number of human genes. Genome Biology, 11:206. [doi:10.1186/gb-2010-11-5-206]

Roest Crollius, H., Jaillon, O., Bernot, A., Dasilva, C., Bouneau, L., Fischer, C., Fizames, C., Wincker, P., Brottier, P., Quetier, F., Saurin, W., and Weissenbach, J. (2000) Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence. Nature Genetics, 25:235-238. [doi:10.1038/76118]

60 comments :

Jonathan Badger said...: The authors are from Craig Venter's company (The Institute for Genomic Research, TIGR) and they include Craig Venter. At the time, TIGR was trying to determine the sequences of human genes.

TIGR wasn't a company -- you may be confusing it with the later Celera, which was. TIGR was a non-profit research institute (it was the forerunner of the current JCVI).; Wednesday, October 21, 2015 3:24:00 PM
Larry Moran said...: Thanks. I wasn't confusing it with Celera but I thought it was organized as a company.; Wednesday, October 21, 2015 4:13:00 PM
Donald Forsdyke said...: Given bases A, C, G, T,
All linked up to infinity,
And given, little, little, me,
Confined to a set, not free.
There’s a need for strategy,
Predatory others for to see.

Predators’ sequences differ from me
One might read G,T,C,C,A,C.
All I need is G,T,G,G,A,G.
Complement catches, now you, I see.

No use burying head in sand,
With you I’ve formed a double strand!
Now I’ll put you in your place,
Then bestow the coup de grace.

Lucky I had GTGGAG around
The bell of doom for you to sound.
With me you found no bonne homie,
But GTGGAG’s where in economy?

My genome set is far from free,
Minute compared with the infinity,
From which predators unpredictably,
Emerge to challenge, challenge, me!

Sometime GTGGAG perchance,
Plays a role in my life’s dance.
But given foes’ infinity,
Need more to serve my liberty.

Seems a burden, but I’m glad,
My genome set, for to add,
Multitudes like GTGGAG,
To double-strand with enemy.

Tortoise-like, I’m no more nimble,
Forsaken genome that’s like thimble.
No longer genome lean and svelte,
For stalling pathogen, my cards were dealt.

Lumbering genome, quite a hunk,
Extra sequence some scoff as junk.
But I know that one day,
Thankful I’ll be f’that DNA.

Shuffled and variable within nations,
Self DNA tuned o’er generations,
Not to double-strand with me,
Only with those that don’t agree.

And subtle, subtle, even more,
Pathogen strategy does not ignore,
It morphs along with stepwise stealth,
To close approach that which is self.

DNA tuned for most hostilities,
Needs match finite possibilities.
Sequences near, but not quite, self,
Here’s where junk has mostest wealth.

Pathogen that inward flies,
Adopting a near-self guise,
Rapid brought to its senses
Cannot o’erleap host defences,; Wednesday, October 21, 2015 10:22:00 PM
peer said...: "The authors are, of course, entitled to their opinion but they are not entitled to state it as if it were a fact. "

Why not? How many authors out there, knowing nothing about biology, claim evolution is a fact? Still they, and you, cannot even point out the ancestor for chimp and man.

Of course I fully agree to your opinion, but it should account for all disciplines of science.; Thursday, October 22, 2015 6:34:00 AM
Mikkel Rumraket Rasmussen said...: What was the name of your great great great great great great great great great great great great great great grandfather, where did he live and how many children did he have?; Thursday, October 22, 2015 7:09:00 AM
peer said...: This simply illustrates that you really do not have clue, right?; Thursday, October 22, 2015 7:53:00 AM
judmarc said...: Still they, and you, cannot even point out the ancestor for chimp and man.

Ah, so you mean they don't pretend to know the answer when the evidence is not there?

What a novel idea. Someone should really take this up and call it something, like maybe "science."; Thursday, October 22, 2015 8:47:00 AM
Eelco van Kampen said...: Of course Mikkel does have a clue. You can't even name your great great great great great great great great great great great great great great grandfather, so what about a million times 'great' ?

And indeed: to date, no fossil has been identified as a potential candidate for the CHLCA. Not surprising, and a pity too, of course.; Thursday, October 22, 2015 8:49:00 AM
peer said...: Eelco is the mindreader of Mikkel, like he was able to review my book before it had been released.

He's a psychic...; Thursday, October 22, 2015 9:01:00 AM
peer said...: As a matter of fact, I described the ancestor of chimp and man in my book...Eelco. You missed it while mind-reading it?; Thursday, October 22, 2015 9:04:00 AM
peer said...: Frontloading and the new biology...

With 100 mutations per genome, the copying fidelity (CF) is 1- 1/30.000.000 = 1-0.00000003 = 0.99999997

If copying fidelity had to evolve gradually, as Moran and his followers here believe, what would happen with organisms having a CF of say 0.90 or 0.99?

They would die because of genomic meltdown within <10 generations, resp. <25 generations.

Genomes had to start with nearly perfect CF.

There is no way out. Genomes were frontloaded.

Frontloading is the new scientific theory from which we can understand biology.; Thursday, October 22, 2015 9:49:00 AM
Eelco van Kampen said...: You wrote a book, Peter ? No way !; Thursday, October 22, 2015 9:57:00 AM
Eelco van Kampen said...: Review ? I did not write a review - you keep on saying that.
I wrote a one-line warning, which still very much applies.; Thursday, October 22, 2015 10:02:00 AM
peer said...: As you know Eelco, my book was published originally in English, sequentially in the JoC, a peer-reviewed science journal.

Then in Dutch.

Soon it will also be available in German. Kutschera will be my next opponent, for sure.; Thursday, October 22, 2015 10:12:00 AM
Eelco van Kampen said...: I had no idea, Peter. No idea at all ! You never ever told anyone about your book, as you are such a modest person.

" ... the JoC, a peer-reviewed science journal"
Ah, your sense of humour is still alive !; Thursday, October 22, 2015 10:21:00 AM
peer said...: A warning not to buy the book...! Isn't that incredible, dear readers?

Why would he do such folly?

Because my book scientifically demonstrated where and how biology falsifies Darwinian philosophies, such as universal common descent and random mutations as the engine for evolution.

In other words, Eelco, was afraid that my book might lead to more Darwin-scepticism (...which is indeed what followed after the release of it).

So he warned potential readers...

Pathetic !; Thursday, October 22, 2015 10:30:00 AM
Eelco van Kampen said...: Wrong again: it was not a warning not to buy the book (I couldn't care less).

It was a helpful warning about the category, as the bookseller listed it as a science book, which of course it isn't.

Have you actually read my warning ?; Thursday, October 22, 2015 10:41:00 AM
Piotr Gąsiorowski said...: There has been a small misunderstanding. JOC (the Journal of Organic Chemistry) is a peer-reviewed science journal (IF=4.721). JoC (the Journal of Creation) is a cargo-cult imitation of a journal (IF=0.000). Not a very convincing imitation, but it has bright colours and a video on its website.; Thursday, October 22, 2015 11:07:00 AM
Eelco van Kampen said...: JoC seems cursed, though - here is another one of those: http://journalofcosmology.com/; Thursday, October 22, 2015 11:13:00 AM
Mikkel Rumraket Rasmussen said...: Peer, it is time to take your meds. Please let the nearest adult handler know your "condition" has been making you make a mess on the internet again.; Thursday, October 22, 2015 11:14:00 AM
Anonymous said...: Transforming the errors per genome into a percent is a distraction. Given an error rate of 1/30,000,000 bp, How many errors per genome should we expect if the genome is 100,000 bp long? 50,000 bp? How big do you think genomes would have been in primitive life forms? How big would the populations be if primitive life forms were mostly self-replicators? Would population sizes have anything to do with whether a genome "melts down" or not? What about the amount of such genomes that has functions?

How's that working for you now?

Frontloading is self-delusion, not a scientific theory.

I suspect that you think that the theory of evolution proposes that everything evolved "gradually," that such thing means from absolute zero to whatever we have now, that you think that evolutionary theory proposes that everything, even replication fidelity, evolves independently from nothing for each species. I suspect that you think one way about what evolutionary theory is one second, a very different one the next second. Typical of creationists. I might be wrong, but the "simplicity" of your "argument" shows that you might have some, if not all, of these assumptions in mind.; Thursday, October 22, 2015 11:57:00 AM
judmarc said...: Frontloading is the new scientific theory from which we can understand biology.

But apparently not grade school arithmetic.

Let us say in accordance with your, ehm, "theory," that we have bacteria whose genomes must contain not only all the information the bacteria themselves need to function, but all the "frontloaded" information specifying all the species to come. As we go through the species, we need progressively less and less frontloading, until we come to humans, who need least of all. If we are to believe you, humans have no junk DNA. Thus bacteria would need genomes much, much larger than humans. Therefore we look at genome size and - oh, whoops!

My condolences on the sad demise of your theory.; Thursday, October 22, 2015 12:56:00 PM
Anonymous said...: Larry said
"Unfortunately, these are the exceptions, not the rule"
Really? The majority of scientific papers on this topic are poor?

Decades ago when I was an undergrad I took a grad class on bad papers. Most of those dealt with technique. For example, a paper that claimed to show DNA in peroxisomes neglected to do a simple DNase control. I'm thinking now that one could do such a class but expanded to cover bad scholarship and poor writing. Sometimes one can learn more about a process by studying bad examples rather than brilliant ones.; Thursday, October 22, 2015 1:05:00 PM
Anonymous said...: Photosynth said
" How big do you think genomes would have been in primitive life forms? How big would the populations be if primitive life forms were mostly self-replicators? Would population sizes have anything to do with whether a genome "melts down" or not? What about the amount of such genomes that has functions?"

I think more important than all of that is whether the primitive life forms would have had recombination. Then they could have recombined out bad mutations from a lineage.; Thursday, October 22, 2015 1:26:00 PM
Anonymous said...: I really don't think recombination would have been important in early life forms. Replication would sometimes lead to exact copies, sometimes different copies that still worked, and sometimes different copies that didn't work. Those last would die (or fail to reproduce), weeding out bad mutations from the lineage.; Thursday, October 22, 2015 3:05:00 PM
Diogenes said...: Journal of Creation will be a "peer-reviewed science journal" when monkeys fly out my ass. Name one discovery, just one, first published in that tripe.

Like all creationist journals, they state outright that they will not publish articles that challenge the "creation model." All articles are thus organized around a hypothesis they seek to make non-falsifiable. Nope, not science.; Thursday, October 22, 2015 4:47:00 PM
Diogenes said...: No Peer, it's really very simple. The earliest self-replicators, being small and simple, could have had huge population sizes and reproduced very quickly. With huge population sizes and super-fast replication speeds, super-high copy fidelity would not be needed.; Thursday, October 22, 2015 4:50:00 PM
Diogenes said...: Larry, wonderful post, well-written. I'm just sorry that you've been forced to write basically the same post over and over criticizing terrible post-ENCODE papers.; Thursday, October 22, 2015 4:52:00 PM
steve oberski said...: Hey Peer,

For what it's worth, I wouldn't have bought the book, with or without a "review".

This is based on your reputation as an IDiot troll, as you have so amply demonstrated and continue to demonstrate.; Thursday, October 22, 2015 6:04:00 PM
Anonymous said...: Diogenes,
It seems to me that replication speed, population size and number of progeny is not relevant for this problem, that just leads to more copying errors faster. I think the key is how tolerant the early replicators were to mutation (how large was the functional space for those replicators) and whether they could recombine, which I don't think is far-fetched in an RNA world. If they could recombine there would always be some lineages that would recombine out harmful mutations, no matter how high the mutation rate.; Thursday, October 22, 2015 6:52:00 PM
Anonymous said...: Iantog,

"It seems to me that replication speed, population size and number of progeny is not relevant for this problem, that just leads to more copying errors faster."

Nope. I agree that tolerance to mutations, functional/sequence space, and recombination are important, but recombination becomes important mostly after selection purges out a lot of the crap (recombining damaged genomes with advantageous ones might not be very productive). Population genetics theory helps a lot in understanding why population size is an important factor, besides genome size, etc.; Friday, October 23, 2015 12:27:00 AM
The Other Jim said...: Also, in a small "genome", back-mutation of specific sites would become more common.; Friday, October 23, 2015 4:43:00 AM
Eelco van Kampen said...: Worse - 'Journal of Creation' has a clear 'statement of faith' that the editors (and thus the authors) need to adhere to (http://creation.com/journal-of-creation-writing-guidelines). That makes it a religious publication, not a science journal.; Friday, October 23, 2015 5:10:00 AM
peer said...: "Transforming the errors per genome into a percent is a distraction."

Of course not. It is a mathematical presentation of genome replication fidelity.

And because it does not serve the Darwinian-evolutionary paradigm it is distraction.

Copying fidelity (CF) can simply be presented as errors/total sequence/generation. It depends heavily upon proofreading and repair mechanisms, and CF will decrease with decreasing proofreading and repair enzymes.

CF must have been almost perfect from the start.; Friday, October 23, 2015 6:20:00 AM
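The difference between per-site copying fidelity and errors per genome, which this exchange keeps returning to, can be made concrete with a little arithmetic. The sketch below is only an illustration: the per-site error rate `mu` and the genome lengths are made-up values, not measurements, and it assumes every site is miscopied independently. Under that assumption the expected number of errors per copy is `mu * length` and the chance of an error-free copy is `(1 - mu) ** length`.

```python
def errors_per_copy(mu, length):
    """Expected number of copying errors in one copy of a genome."""
    return mu * length


def error_free_fraction(mu, length):
    """Probability that one copy contains no errors at all,
    assuming each site is miscopied independently with probability mu."""
    return (1.0 - mu) ** length


# Hypothetical numbers, chosen only for illustration.
mu = 1e-3  # one error per thousand sites, i.e. 99.9% per-site fidelity
for length in (100, 10_000):
    print(length, errors_per_copy(mu, length), error_free_fraction(mu, length))
```

With the same 99.9% per-site fidelity, a 100-site replicator yields roughly 90% perfect copies while a 10,000-site one yields almost none, which is why a percentage on its own says little unless the genome length is stated too.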
peer said...: "The earliest self-replicators, being small and simple, could have had huge population sizes and reproduced very quickly. With huge population sizes and super-fast replication speeds, super-high copy fidelity would not be needed."

We can summarize this as: blablabla. Non-science. Not a skerrick of scientific evidence for this blablabla. Are you, as a scientist, sticking to scientific facts? Surely not. Just blabla.; Friday, October 23, 2015 6:22:00 AM
peer said...: "Let us say in accordance with your, ehm, "theory," that we have bacteria whose genomes must contain not only all the information the bacteria themselves need to function, but all the "frontloaded" information specifying all the species to come. As we go through the species, we need progressively less and less frontloading, until we come to humans, who need least of all. If we are to believe you, humans have no junk DNA. Thus bacteria would need genomes much, much larger than humans. Therefore we look at genome size and - oh, whoops!"

Bacteria strive for simple and streamlined genomes. Loss of information and unused genes is what is evident from Lenski's experiments. Frontloading predicts this.

Further, you describe the general frontloading theory. I personally stick to the special frontloading theory, which holds independent origin of several forms of life.

The hypothetical LUCA is also perfectly in accord with frontloading, having the same genes manyfold over (Whitfield, “Origins of life: Born in a watery commune,” Nature 427, 674–676 (19 February 2004)).

Frontloading is the only viable evolutionary theory.; Friday, October 23, 2015 6:30:00 AM
peer said...: Furthermore, as evident from PCR experiments, the most simple replicator will win the reproduction race for survival, not the more complex.

Do you even understand the primary principle of your belief system?; Friday, October 23, 2015 6:33:00 AM
AllanMiller said...: I've never quite understood the 'meltdown' argument on the grand scale. It seems to assume that the only things that happen are deleterious mutations and Muller's ratchet. If replicators continue to be produced, selection will deal with the lineages that aren't very good at it, leaving the rest. If (say) 5 mutations is a threshold, and the world has ratcheted up to be full of 4-mutation individuals, then in that genomic context, a 5th is lethal, when it would have been merely detrimental had it happened sooner. So ... the world belongs to those that don't produce that lethal mutation, or compensate in some other way.

A mechanism that can extinguish one lineage does not have the power to extinguish them all assuming the replication exponent was ever >1 - it's not as if we'd ever be likely to get to a stage where every mutation in every lineage is lethal.; Friday, October 23, 2015 6:49:00 AM
peer said...: "selection purges out a lot of the crap (recombining damaged genomes with advantageous ones might not be very productive)"

...without a nearly perfect CF, you cannot talk about genomes. Genomes depend on a nearly perfect CF.

Why don't you guys set up an experiment, instead of all the blabla.

Take a plasmid containing one functional gene. Amplify it using PCR with and without proofreading enzymes...

Check the functionality of the gene after every tenth amplification round.

Surely, the one without proofreading degenerates with the speed of light.

The other less fast.

You may as well do the reverse. Take a plasmid containing a nonsense code...and amplify it in a PCR. Will we ever observe any functionality of the nonsense gene?; Friday, October 23, 2015 6:50:00 AM
AllanMiller said...: "Furthermore, as evident from PCR experiments, the most simple replicator will win the reproduction race for survival, not the more complex."

In a selective environment that rewards shorter genomes, shorter genomes will have the advantage. So what?; Friday, October 23, 2015 6:55:00 AM
AllanMiller said...: "Take a plasmid containing one functional gene. Amplify it using PCR with and without proofreading enzymes...

Check the functionality of the gene after every tenth amplification round."

So you completely remove selection? You think that's a valid protocol?; Friday, October 23, 2015 6:57:00 AM
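The disagreement over this protocol turns on what happens to copying errors with and without selection, and that contrast can be sketched numerically. What follows is a minimal sketch, not anything a commenter here ran. It assumes a deterministic, infinite-population model: each copy gains a Poisson-distributed number of new deleterious errors with mean `U`, and each error multiplies reproductive success by `1 - s`. The values of `U` and `s` are hypothetical. Because the model has no random drift, it says nothing about Muller's ratchet in small populations.

```python
import math


def poisson_pmf(mean, kmax):
    """Poisson probabilities for 0..kmax new errors, built iteratively."""
    probs = [math.exp(-mean)]
    for k in range(1, kmax + 1):
        probs.append(probs[-1] * mean / k)
    return probs


def next_generation(freqs, s, select, pmf):
    """One round of (optional) selection followed by error-prone copying."""
    n = len(freqs)
    if select:
        # Lineages with k errors reproduce in proportion to (1 - s) ** k.
        weighted = [f * (1.0 - s) ** k for k, f in enumerate(freqs)]
        total = sum(weighted)
        freqs = [w / total for w in weighted]
    new = [0.0] * n
    for k, f in enumerate(freqs):
        if f == 0.0:
            continue
        for j, p in enumerate(pmf):
            new[min(k + j, n - 1)] += f * p  # top class absorbs any overflow
    total = sum(new)  # renormalise because the Poisson tail is truncated
    return [x / total for x in new]


def run(U, s, generations, select, classes=200, max_new=30):
    """Frequencies of lineages carrying 0, 1, 2, ... errors."""
    pmf = poisson_pmf(U, max_new)
    freqs = [1.0] + [0.0] * (classes - 1)  # start from error-free copies
    for _ in range(generations):
        freqs = next_generation(freqs, s, select, pmf)
    return freqs


def mean_errors(freqs):
    return sum(k * f for k, f in enumerate(freqs))


U, s = 0.5, 0.1  # hypothetical error rate per copy and cost per error
for gens in (10, 50, 200):
    without = mean_errors(run(U, s, gens, select=False))
    with_sel = mean_errors(run(U, s, gens, select=True))
    print(gens, round(without, 2), round(with_sel, 2))
```

Under these assumptions the mean error count without selection keeps growing as `U` times the number of generations, whereas with selection it levels off near `U / s`. Copying without selection therefore measures fidelity only. It cannot show what a replicating population under selection would do.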
peer said...: Allan Miller,

Selective extinguishing of lineages loaded with slightly deleterious mutations is known as truncated selection. It does not work, because the input of nearly neutral mutations far exceeds the power of truncated selection.
The point is also redundancy, in particular in 2n slow-reproducing organisms, so that selection simply cannot see the nearly neutral mutations (compensation).

You must believe that selection is a sort of god (it is a God-substitute indeed), having the power to do everything and anything. For a belief that is okay, however. But as scientists we would like to have some proof. Isn't it?

The scientific evidence shows that selection can mitigate the decay of genomes. It is like copying a page from a book. After several rounds of making copies of copies, the text will become less and less readable. Of course you are allowed to pick out (select) the most readable copy and start again making copies using this one as the original, but eventually copies of copies will not be readable.

Loss of information through random mutations is a law of life.

It is so easy to understand. Only those committed to the selection god will be blind to the obvious.

Anyway...it's your life.; Friday, October 23, 2015 7:09:00 AM
peer said...: "So you completely remove selection? You think that's a valid protocol?"

We were discussing copying fidelity (CF). Yes, for CF this is a valid protocol.

Selection has to be determined in another way. Such as what I proposed on making copies of copies. Select the best and then make again copies of copies. Select the best, etc. It slows down genomic entropy, the loss of information.; Friday, October 23, 2015 7:15:00 AM
peer said...: "Larry, wonderful post, well-written. I'm just sorry that you've been forced to write basically the same post over and over criticizing terrible post-ENCODE papers."

I get your point. Before ENCODE there was hope for the Darwinians and a belief for the atheist. After ENCODE there is no hope for Darwinians and the atheist's belief system is shattered.

Or as Graur said: If ENCODE is true, then evolution is wrong.

That is what it is all about. A belief system.; Friday, October 23, 2015 7:24:00 AM
peer said...: "In a selective environment that rewards shorter genomes, shorter genomes will have the advantage. So what?"

Apparently, you do not understand that selection is nothing but differential reproduction?

The entity that leaves the most copies will always make up the population. This is another law of living systems. It does not tell us anything about information, neither about complexity.

Only if you link an increase of information to increased reproduction, Darwinian evolution is possible. Biology shows us the opposite. Increased complexity is linked to reduced reproduction.

Get it? Nothing in Darwinism makes sense in the light of biology.; Friday, October 23, 2015 7:31:00 AM
Ed said...: Peer says:
"I personally stick to the special frontloading theory, which holds independent origin of several forms of life."

LOL Special frontloading... now the religious monkey comes out of the sleeve, special creation of humans obviously. YEC meets ID.; Friday, October 23, 2015 7:42:00 AM
AllanMiller said...: "You must believe that selection is a sort of god (it is a God-substitute indeed), having the power to do everything and anything. For a belief that is okay, however. But as scientists we would like to have some proof. Isn't it?"

Don't be silly. Selection is differential reproduction correlated with genotype. How much proof of that do you need?

Selection will remove any lineage that exceeds some imaginary 'meltdown' threshold. That leaves all the rest. There is no case for saying that all lineages everywhere should wink out.; Friday, October 23, 2015 8:03:00 AM
AllanMiller said...: "So you completely remove selection? You think that's a valid protocol?"

"We were discussing copying fidelity (CF). Yes, for CF this is a valid protocol."

So you can show that non-faithful replication without selection produces an excess of non-faithful copies? Stunning. Get on the phone to Nature; they are gagging for material of this quality.; Friday, October 23, 2015 8:07:00 AM
judmarc said...: "Bacteria strive for simple and streamlined genomes."

Do they wear t-shirts and spandex while they "strive"?

"Further, you describe the general frontloading theory. I personally stick to the special frontloading theory, which holds independent origin of several forms of life."

Ah, of course - independent origin, so there is no necessity for frontloading!...oh, wait.; Friday, October 23, 2015 8:07:00 AM
AllanMiller said...: "In a selective environment that rewards shorter genomes, shorter genomes will have the advantage. So what?"

"Apparently, you do not understand that selection is nothing but differential reproduction?"

What makes you think I don't appreciate that fact? In an environment where shorter genomes produce more descendants ... guess what will happen?; Friday, October 23, 2015 8:09:00 AM
AllanMiller said...: "Loss of information and unused genes is what is evident from Lenski's experiments. Frontloading predicts this."

So gradually, the frontloaded genome has been shorn of its information to make modern [insert arbitrary clade here]? Must have had a sodding big genome once. Easily outcompeted by its striving descendants.; Friday, October 23, 2015 8:21:00 AM
Eelco van Kampen said...: Of course Mikkel does have a clue. You can't even name your great great great great great great great great great great great great great great grandfather, so what about a million times 'great' ?

And indeed: to date, no fossil has been identified as a potential candidate for the CHLCA. Not surprising, and a pity too, of course.; Friday, October 23, 2015 8:27:00 AM
peer said...: The frontloaded genomes have by now all fallen apart into several different species. Check out the Cichlids in the African Rift Valley lakes...

About 12,000 years ago Lake Victoria dried up. Within 6000 generations 500 species emerged...filling all niches, from vegetarians to parasites to predators. New genetic information is not involved. We only need a novel regulatory context of preexisting information.

That is explained by frontloading theory. There is no need for millions of years of Darwinian selection and accumulation of genetic noise. Darwinian evolution of novel codes is not even possible within such time frames.

Frontloading will soon be the new standard.; Friday, October 23, 2015 9:03:00 AM
peer said...: "Must have had a sodding big genome once. Easily outcompeted by its striving descendants. "

Ask Moran about the onion genome...; Friday, October 23, 2015 9:05:00 AM
peer said...: "LOL Special frontloading... now the religious monkey comes out of the sleeve, special creation of humans obviously. YEC meets ID."

Ed, I was referring to "general" and "special frontloading" in the same way as "general" and "special relativity"...

Considering the biological facts, special frontloading better suits the observations.; Friday, October 23, 2015 9:11:00 AM
Eelco van Kampen said...: "Further, you describe the general frontloading theory. I personally stick to the special frontloading theory, which holds independent origin of several forms of life."

and

"Ed, I was referring to "general" and "special frontloading" in the same way as "general" and "special relativity"..."

This is getting sillier by the day ... how on Earth would that be 'in the same way' ???

Please elaborate - it's Friday, so you are allowed to make a fool of yourself (again). You do that so well, after all.; Friday, October 23, 2015 9:21:00 AM
Ed said...: "Ed, I was referring to "general" and "special frontloading" in the same way as "general" and "special relativity"..."

Yes, tbh this explanation really doesn't surprise me. With your bloated ego, it's no wonder you like to identify yourself with the BIG names in science. General and special frontloading... sjeesh.
The publisher of your book in the Netherlands has this banner on its site where they compare you with Einstein and Darwin. The banner is about a campaign on 'fair science', i.e. schools should be allowed to teach (drums please)... creationism instead of evolution science. (I bet you didn't see that one coming!)
So yeah, it doesn't surprise me that your ego is bloated as it is, and your religious agenda was also very obvious.; Friday, October 23, 2015 11:02:00 AM
Anonymous said...: Did you notice guys? "Peer" ignored everything explained to him just to keep believing that he's right. He's a self-deluded fool. Nothing will make him listen because he doesn't want to.; Friday, October 23, 2015 11:27:00 AM
Diogenes said...: Peer, some people here *worked for* or advised ENCODE. I know people who advised ENCODE. Our own Georgi Marinov, to name one, *works for* ENCODE and he's the first to admit that ENCODE got it wrong: that the claim of 80% functionality was false by every pre-existing definition of "function."

Graur was right: if ENCODE is true, evolution is wrong. But people who *work for* or advised ENCODE acknowledge that ENCODE's functionality claim was wrong.

Should I believe them or a nobody religious fanatic pushing creationism, which the publisher of your book in the Netherlands says is what you're selling?

(Georgi might add two caveats: the ENCODE database might be useful to scientists somewhere down the road, even if the 80% functionality was bullshit; and the 80% would be true if we redefine "function" = interacts with any molecule at any time.)

Your creationism never had any scientific hopes to begin with. All the hopes of creationism were pinned on religiously-motivated political successes. Everything "scientific" in IDcreationism was fraud, hoaxes and non-falsifiable statements like "special theory of frontloading."; Friday, October 23, 2015 12:06:00 PM

