Friday, August 07, 2020

Alan McHughen defends his views on junk DNA

Alan McHughen is the author of a recently published book titled DNA Demystified. I took issue with his stance on junk DNA [More misconceptions about junk DNA - what are we doing wrong?] and he has kindly replied to my email message. Here's what he said ...
I wrote DNA Demystified with the knowledge and intent for spurring debate and discussion on a number of issues.

My position on 'junk DNA' hasn't changed much since I first learned about it in the early to mid-1970s. My primary concern now is that the term 'junk' is inappropriate, as it conveys an immediate negative image and engenders an emotional response. There may well be DNA sequences that serve no useful purpose and are only wastage, carried along through the generations as burdensome baggage (i.e. the 'ordinary' definition of 'junk'). Initially, as I'm sure you remember, all non-coding DNA was considered (by some) as "Junk DNA". I was never among them, expecting that eventually scientists would find some adaptive value to at least some of the non-coding sequences. I am happy to accept the data that this has come to pass-- that the now well-documented regulatory functions alone, for example, justify trashing the 'junk' label.

If there are tracts of truly useless DNA, it would be interesting to see how the organism responds when such sequences are deleted from the genome. That would be a true test of whether or not the excised DNA sequences were 'junk'.

You are free to disagree, of course, but I wanted to clarify my position.
Alan McHughen appears to dislike the term "junk" DNA because of its negative image and also because he thinks that the original definition has been disproved.

I don't want to discuss the first point because it's a red herring. As far as I can tell, the only people who dislike the word "junk" do so because they don't believe it's an accurate description of a substantial part of our genome. So let's just discuss the second point.

If I understand him correctly, his second point is that the term "junk" DNA was originally synonymous with "noncoding" DNA and, as he explains in his book, the scientists who used the word "junk" did so because they thought that all noncoding DNA was useless. His position now is that some noncoding DNA has been shown to be functional thereby refuting the original definition.

Let me remind readers of what he wrote in his book.
When it was first discovered, the nongenic DNA was sometimes called—somewhat derisively by people who didn't know better—"junk DNA" because it had no obvious utility, and they foolishly assumed that if it wasn't carrying coding information it must be useless trash.
My position is that there was never a time when knowledgeable scientists said that all noncoding DNA was junk. They never assumed that the only functional sequences in our genome were protein-coding sequences. Junk DNA was always defined as excess DNA that had no function and that definition is still valid.

Alan McHughen and I are from the same era but we clearly hung out with different crowds. My mentors were members of the 'phage group who were actively working on genes and their regulation and actively investigating other functional elements. I attended summer meetings at Cold Spring Harbor for five years (1969-73) and I can assure you that anyone who stood up in front of that group and said that all noncoding DNA was junk would have been laughed out of the room.

Here's what I knew in the early 1970s.
  • Some genes did not encode proteins. Ribosomal RNA genes and tRNA genes were discussed in the first edition of Watson's textbook in 1965. We all knew about these functional noncoding sequences.
  • Regulatory sequences such as promoters and operators controlled the expression of genes. The noncoding regulatory sequences of the lac operon and of the major operons of bacteriophage lambda were well known. Nobody ever thought that these noncoding regions were junk.
  • We knew about centromeres—noncoding functional DNA.
  • We knew about origins of replication—noncoding functional DNA. (I was working on DNA replication.)
Now, I'm not denying that there might have been scientists who didn't know these things and I'm not denying that some of them might have foolishly thought that all noncoding DNA was junk. These scientists may have been part of the group that Alan McHughen knew in the 1970s but that group did not define junk DNA. They were not the experts.

Let's look at the 1972 paper by Susumu Ohno because that's the paper that made the term "junk DNA" popular. Ohno was an evolutionary biologist and a molecular geneticist and he was familiar with the thinking of the scientists in the 'phage group. He begins his paper by referring to the C-value paradox because that's an important part of the early thinking about junk DNA. Why do some species have a lot more DNA than others? ... it's because the excess DNA is junk. That's still the only reasonable explanation of the so-called C-value Paradox.

Ohno then discusses the genetic load argument by pointing out that we can only have about 30,000 genes or our species would go extinct. He estimates that only about 6% of our genome could be functional and references Kimura and Ohta's seminal paper on mutation rates and effective population sizes (Kimura and Ohta, 1971). He then says ....
Aside from conventional structural genes and regulatory genes, this 6% should include the promoter and operator region which are situated adjacent to each structural gene, for these regions can definitely sustain deleterious mutations. [His emphasis.]
Ohno did NOT think that all noncoding DNA was junk and neither did anyone else who knew what they were talking about. Ohno, and many others, knew perfectly well that regulatory sequences exist and that they are not junk. These experts did not foolishly assume that "if it wasn't carrying coding information it must be useless trash."

So, I do not agree with Alan McHughen that the original definition of junk DNA equated it with noncoding DNA and I do not agree with him that the discovery of regulatory sequences "justify trashing the 'junk' label." I still think the genetic load argument has to be dealt with by opponents of junk DNA.

I'm still not exactly sure where the revisionist history comes from. Perhaps someone can help me out by coming up with a reference from the 1970s where some knowledgeable scientist makes the point that all noncoding DNA must be junk.

Now let's move on to 2020. There are a large number of experts who think that most of our genome is junk. I'd like to ask Alan how he deals with the evidence for junk DNA and what evidence he can offer to support the claim that most of our genome is functional.

Here's a paper from my friends Alex Palazzo and Ryan Gregory (Palazzo and Gregory, 2014) and another one from Ford Doolittle and Tyler Brunet (Doolittle and Brunet, 2017). They are good starting points for further discussion.
Palazzo, A.F. and Gregory, T.R. (2014) The Case for Junk DNA. PLOS Genetics 10:e1004351. [doi: 10.1371/journal.pgen.1004351]

With the advent of deep sequencing technologies and the ability to analyze whole genome sequences and transcriptomes, there has been a growing interest in exploring putative functions of the very large fraction of the genome that is commonly referred to as “junk DNA.” Whereas this is an issue of considerable importance in genome biology, there is an unfortunate tendency for researchers and science writers to proclaim the demise of junk DNA on a regular basis without properly addressing some of the fundamental issues that first led to the rise of the concept. In this review, we provide an overview of the major arguments that have been presented in support of the notion that a large portion of most eukaryotic genomes lacks an organism-level function. Some of these are based on observations or basic genetic principles that are decades old, whereas others stem from new knowledge regarding molecular processes such as transcription and gene regulation.

Doolittle, W.F. and Brunet, T.D. (2017) On causal roles and selected effects: our genome is mostly junk. BMC Biology 15:116. [doi: 10.1186/s12915-017-0460-9]

The idea that much of our genome is irrelevant to fitness—is not the product of positive natural selection at the organismal level—remains viable. Claims to the contrary, and specifically that the notion of “junk DNA” should be abandoned, are based on conflating meanings of the word “function”. Recent estimates suggest that perhaps 90% of our DNA, though biochemically active, does not contribute to fitness in any sequence-dependent way, and possibly in no way at all. Comparisons to vertebrates with much larger and smaller genomes (the lungfish and the pufferfish) strongly align with such a conclusion, as they have done for the last half-century.

Kimura, M. and Ohta, T. (1971) Protein polymorphism as a phase of molecular evolution. Nature 229:467-469. [doi: 10.1038/229467a0]

104 comments :

João said...

The more I see the myth ("there is no junk DNA") popping up everywhere, the more strongly I believe your book needs to be published.

I've talked to a lot of biologists here in Brazil who repeat the motto "all non-coding DNA was thought to be junk but modern science has shown that...". And I'm talking about professionals. It's really weird. I am no expert in the field (although I'm a biologist myself), but it feels like junk is almost inevitable if the conditions are met.

However, I believe I came to think more positively towards Junk DNA mostly because of your blog. So that's why I think we need a book such as yours published and widely available.

Mikkel Rumraket Rasmussen said...

I have to say it's rather ironic that the standard story, as told by Alan McHughen, is that non-coding DNA was immediately dismissed as junk, when that very same author insists he always thought it was functional. But then it simply isn't true to say it was immediately dismissed as junk! "Some" did (who?), some didn't. But then the story is false, as many people didn't immediately dismiss it as junk DNA. And there appears to be no one at all who has kept insisting that it's all junk.

In fact, I'd like to see if anyone can actually name a single person, with a reference and a quote, who really did think all non-coding DNA was junk. Or much more difficult, that this view somehow grew to be the standard majority view in genetics that needed debunking.

SPARC said...

Ironically, the first non-coding sequences were identified based on their function. Long before DNA sequencing was even available. Thus, there was truly no reason to consider them as junk.

Don Cates said...

Maybe if you check the non-biologists (at least the non-evolutionary biologists). Could start with the physicists?

Mark Sturtevant said...

Funny, but I just came across an article about how many people remember a '90s movie called "Shazaam!", starring Sinbad. But this movie actually never existed. It is a nice example of how people can form a false memory, and in this case how that can happen en masse through exposure to media. This notion that people once thought most ncDNA was junk also seems like a widely held false memory. But it never happened!

What do people think about where we are in this junk DNA debate? I've read here and elsewhere that 'most biologists' think that most of our eukaryote genome is mysteriously functional, and that the 'it's mostly junk' model is false. But I sound out plenty of biologists and not a single one thinks that at all. They all understand pretty well that most DNA seems to be selectively neutral and without discernible function. Junk. And they are well aware of the ENCODE debacle and are plenty pissed about it.

Finally, there should be no problem with calling this DNA 'junk'. It seems more edgy and colorful rather than negative and emotional. And anyone should know that sometimes junk gets put to use.

Don Cates said...

Yes, a much more accurate term than the alternative "garbage".

Joe Felsenstein said...

A little testimony: I was in graduate school in Dick Lewontin's lab in 1964-1967. In that era it was clear from measurements of the amount of DNA that there was too much DNA to be accounted for by coding sequences. Remember that then we did not know about introns. The wisest heads in molecular biology may have privately discussed the likelihood that this extra DNA was junk, but we did not hear that discussion and it was just a mystery to us. After 1969, with Ohno publicizing the concept of junk DNA, and with Motoo Kimura arguing that much DNA substitution was neutral, we were surprised that there could be so much junk DNA. At first most population geneticists, and almost all other evolutionary biologists, assumed that natural selection would quickly discard useless DNA. But we were of course already aware of deleterious mutants in the genome that were maintained by a balance between deleterious mutation and purifying selection. It took some years of learning about processes such as transposons jumping around and formation of pseudogene copies before we appreciated the rate of formation of junk DNA and the slowness with which it was removed from the genome. By the 1980s the panselectionism that made people resist acknowledging the existence and importance of junk DNA was beginning to fade away (Alan McHughen seems to be a holdout). Of course all along geneticists of all sorts knew that there were functional sequences that were not protein-coding loci -- it was hard to imagine that those accounted for all of the extra DNA. Obviously molecular biologists has different reactions, and they did not appreciate the relative strengths of the forces creating and removing junk DNA, so they continue to resist acknowledging its importance. I wonder whether the prospect of applying for lots and lots of grants to work out all this mysterious new function tempts them to consider all of the genome functional.

Joe Felsenstein said...

typo: "obviously molecular biologists have different reactions ..."

The Lorax said...

I would not throw molecular biologists under the bus here, as I would argue molecular biologists, along with biochemists, and geneticists were key in discovering introns, tRNA, promoters, etc. I would argue it is primarily mammalian geneticists (i.e. cancer biologists and the like) who refuse to read outside their fields.

Why do some plants need so much more DNA than humans? why do some eukaryotic microbes need so much more DNA than humans? How do you explain the C-value paradox, especially among closely related species? All these questions basically reveal those who assume all biology is human biology.

John Harshman said...

That way lies the dog's-ass plot and madness.

SWoody said...

One minor quibble with your otherwise excellent exposition: genomic regions that encode tRNAs and rRNAs cannot be casually called "noncoding" regions; I'm pretty sure that is what's known as a tautology. I readily acknowledge that "non-protein coding regions" is clumsy; perhaps someone might coin a preferable but non-exclusionary reference to "coding" regions.

Larry Moran said...

I sent Alan McHughen a link to this post and I was hoping to engage him in further discussion. I've just heard back from him and, unfortunately, he's too busy with other projects right now so he won't be able to participate.

AllanMiller said...

@Swoody - I dunno; functional RNAs are just antisense transcripts; their sequence is a noncoded mirror of the complementary DNA strand, so I think 'noncoding region' would have to include them - slightly unsatisfactory perhaps, since they do generate a product.

John Harshman said...

"Non-genic" DNA would be a reasonable term if you don't like "non-coding" and if you adopt Larry's definition of "gene": a sequence that is transcribed to produce (with or without translation) a functional product. Genic DNA would still be a tiny percentage of the genome, and non-genic DNA would still not be synonymous with junk DNA.

Marcoli said...

@SWoody: I should think the term 'noncoding DNA' is a very broad term that refers to DNA that does not directly code for a functioning product. It should include DNA that has no function for the organism (discussed here as junk), and DNA that does have some sort of function but still does not directly code for RNA (centromeres, promoters, etc.).
Given this range, the term easily causes confusion.
Meanwhile, perhaps genes for tRNA and rRNA and the like should be referred to as non-translated genes.

AllanMiller said...

Hence the difficulty with the restrictive, molecular-biological definition of 'gene': one has to first accept that before 'non-genic' can cover any ground. I prefer 'gene' to cover the DNA sequence underlying a given character state. That is of course much more labile: a 'character state' in cladistics could simply be any raw sequence, while gross aspects of morphology do not map simply to sequence. That's the appeal of the 'gene-product' mapping: discrete identity. But there also exists discrete sequence with no product but a heritable effect on phenotype. Thus I'd wrestle the term back from the molecular biologists and into the world of inheritance from whence it came.

Not that I'm consistent - I'll cheerfully use the term 'intergenic'...

John Harshman said...

As has been pointed out, "coding" specifically refers to the genetic code and thus to DNA that gets translated. DNA doesn't code for RNA, as there's no code or translation involved, just transcription.

Darwin said...

Hello

Darwin said...

Hello Dr. Moran, I am a big follower of yours and I also think that very little of our DNA is functional. I would like to ask you a favor: a creationist named Sean D. Pitman published a post on his blog, Design Detection, about junk DNA and about the 2014 result that only 8.2% is functional.

Could you analyze it in your next post?

Darwin said...

So what part of the human genome in general is actually functional? Well, this has been a hot topic of debate since 2012, when ENCODE scientists announced their estimate of ~80% functionality for the human genome (ENCODE: The human encyclopaedia, September 5, 2012). This initial estimate was strongly questioned and even mocked by numerous scientists (Link). Then a couple of years later, in 2014, researchers at the University of Oxford, UK, concluded that only about 8.2% of the human genome is constrained by natural selection. The rest, they argued, is not functional (Rands, 2014). While the 80% figure seemed too high, the 8% figure also seems a bit low to several scientists (Link). Patrik D'haeseleer, a computational biologist at Lawrence Livermore National Laboratory, California, tweeted "only between 8% and 80% of the human genome is functional. I'm glad we solved that." At the core of the problem are the different definitions of "function." Erick Loomis, an epigeneticist at Imperial College London, tweeted: "Maybe we should stop using 'functional' if we can't find a common definition." (Link).

Darwin said...

Dr. John Greally, an epigeneticist at Yeshiva University's Albert Einstein College of Medicine in New York City, argued that the Rands et al. paper "missed an opportunity to explore why certain sequences, especially those known as transcription factor binding sites, are under such low evolutionary pressure, despite presumably having important biological roles. Instead, the authors emphasized the alleged discrepancy with ENCODE. The article seems to be used as a stick with which to hammer the ENCODE project, not necessarily by the authors, but by others." (Link)
This is all very interesting because it has been known for some time, as Dr. Greally points out, that non-conserved sequences (based on presumed evolutionary relationships) can still be functional. For example, Kellis (2014) argues that:
"The lower bound estimate that 5% of the human genome has been under evolutionary constraint was based on the excess conservation observed in mammalian alignments relative to a neutral reference (typically ancestral repeats, small introns, or fourfold degenerate codon positions). However, estimates incorporating alternative references, shape-based constraint, evolutionary turnover, or lineage-specific constraint each suggest approximately two to three times more constraint than before (12-15%), and their union could be even larger, as they each correct different aspects of alignment-based excess constraint…. Although still weak in power, human population studies suggest that an additional 4-11% of the genome may be under lineage-specific constraint after specifically excluding protein-coding regions."
This means that at the very least 16% to 26% of the genome is likely to be functionally constrained to one degree or another. And of course this means that the probable deleterious mutation rate is at least four times higher than the Ud = 2.2 rate that Keightley suggested in 2012 (and some would argue even higher), that is, around 8.8 deleterious mutations per offspring per generation. This, of course, would imply a necessary reproduction rate of more than 13,200 offspring per woman per generation (and a mortality rate of more than 99.99% per generation). (Link).
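
The reproduction figure in the quoted passage appears to follow from a simple Poisson calculation. The sketch below reproduces that arithmetic under assumptions that are not stated in the quoted text: new deleterious mutations per offspring are Poisson-distributed with mean U, only mutation-free offspring count as survivors, and each woman must leave two such survivors. The function names are illustrative only.

```python
import math


def mutation_free_fraction(u_del: float) -> float:
    """Poisson probability that an offspring carries zero new deleterious mutations."""
    return math.exp(-u_del)


def offspring_needed_per_female(u_del: float, survivors_needed: float = 2.0) -> float:
    """Births required, on average, to leave `survivors_needed` mutation-free offspring."""
    return survivors_needed / mutation_free_fraction(u_del)


# Figures taken from the quoted passage: Ud = 2.2, scaled up fourfold.
u_base = 2.2
u_scaled = 4 * u_base

births = offspring_needed_per_female(u_scaled)
implied_mortality = 1.0 - mutation_free_fraction(u_scaled)

print(f"U = {u_scaled:.1f}")
print(f"births needed per female: {births:,.0f}")
print(f"implied mortality: {implied_mortality:.4%}")
```

With U = 8.8 this comes out a little under 13,300 births per female, consistent with the "more than 13,200" in the passage. The implied mortality under the same assumptions is about 99.98%, so the "more than 99.99%" in the passage may rest on a slightly different calculation. The sketch only reproduces the arithmetic; it says nothing about whether the strict "mutation-free survivors only" assumption is biologically realistic.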

Darwin said...

So it's no wonder this is such a hot topic for neo-Darwinists. Much depends on which part of the human genome is actually functional. And, who knows, it may turn out that the answer to the "C-value conundrum", and to why many functional elements within the human genome are not significantly constrained by natural selection, lies at least in part in various forms of redundancy of functional elements. In other words, while various sections of noncoding DNA may be functional, there may also be other redundant copies of these functional sequences within the genome. This would mean, of course, that any particular copy could sustain numerous mutations without natural selection taking significant notice. Only once functional redundancy is exhausted would there be a significant functional deficit for the organism and a need for natural selection to step in. In fact, this same argument was used by Kellis et al. in 2014:
"The approach can also miss elements whose phenotypes occur only in rare cells or specific environmental contexts, or whose effects are too subtle to detect with current assays. Loss-of-function tests can also be buffered by functional redundancy, such that double or triple disruptions are necessary for a phenotypic consequence. Consistent with redundant, contextual, or subtle functions, deletion of large and highly conserved genomic segments sometimes has no discernible organismal phenotype, and apparently debilitating mutations in genes that are considered indispensable have been found in the human population."
So clearly this observation significantly undermines the impact of the Rands et al. paper, because Rands's conclusion of overall human genomic functionality of only 8.2% is entirely based on "constrained" sequence homologies between different mammalian species. Rands does not take into account the possibility of functional aspects of DNA that would not be significantly constrained between or even within various species. In fact, it has been generally known for some time that there is a lot of redundancy in the human genome, as most genes and other functional genetic elements have at least two copies within the genome, and some have several dozen or even several hundred copies. The human genome is in fact a "repeating landscape". Of course, some biologists considered the repetition superfluous or a kind of "back-up supply" of DNA (Link, Link). Genetic redundancy is the key to the robustness of organisms, that is, their built-in flexibility to quickly adapt to different environments. It is also in line with very good design. Consider the arguments of David Stern (HHMI researcher) in this regard:

Darwin said...

"Over the past 10 to 20 years, research has shown that instructional regions outside of the protein-coding region are important in regulating when genes are turned on and off. Now we are discovering that the extra copies of these genetic instructions are important for maintaining stable gene function even in a variable environment, so that genes produce the correct output for organisms to develop normally." (Frankel et al., 2010).
For example, in 2008, Michael Levine of the University of California, Berkeley reported the discovery of secondary enhancers for a particular fruit fly gene that were located much farther from the target gene than the previously discovered enhancers, which lie adjacent to the gene. "Levine's team called the seemingly redundant copies in distant genetic realms 'shadow enhancers,' and hypothesized that they might serve to ensure that genes are expressed normally, even if development is disturbed. Factors that could induce developmental disturbances include environmental conditions, such as extreme temperatures, and internal factors, such as mutations in other genes."
So, Stern and his team put Levine's hypothesis to the test by studying a fruit fly gene that encodes the production of tiny hair-like projections on the insect's body, which are called trichomes. "The gene, known as shavenbaby, takes its name from the fact that flies with a mutated copy of the gene are nearly hairless. Stern previously led a research effort that identified three major enhancers for shavenbaby. In the new research, his team discovered two shadow enhancers for shavenbaby, located more than 50,000 base pairs from the gene.
In their experiments, the researchers removed these two shadow enhancers, leaving the primary enhancers in place, and observed the development of fly embryos under a range of temperature conditions. At optimal temperatures for fruit fly development, around 25 degrees Celsius, or a comfortable 77 degrees Fahrenheit, the embryos without shadow enhancers had only very slight defects in their trichomes. But the results were very different when the researchers looked at embryos that developed at temperatures close to the extremes in which developing fruit flies can survive: 17 degrees Celsius, or 63 degrees Fahrenheit, at the low end and 32 degrees Celsius, or 90 degrees Fahrenheit, at the upper limit. These flies without shadow enhancers developed with serious deficiencies in the number of trichomes produced." (Link)
"These results indicate that genetic instructions that seemed reliable at optimal temperatures were simply not up to the task under other conditions," Stern said. (Link)
"Back-up regulatory DNAs, also called shadow enhancers, ensure the reliable activities of essential genes like shavenbaby, even under harsh conditions, such as increases in temperature," Levine said. "If Dr. Stern and his associates had not examined shavenbaby's activities under such conditions, then the shadow enhancers might have been missed, as they are not needed when fruit flies are grown under optimal growing conditions in the laboratory." (Link)

Darwin said...

Of course, the very existence of genetic buffering and the functional redundancies necessary for it presents a paradox in light of evolutionary concepts. On the one hand, for genetic buffering to take place there is a need for redundancies of genetic function; on the other hand, such redundancies are clearly unstable against natural selection and therefore unlikely to be found in evolved genomes (Link). So why is there still so much genetic buffering within the genome if natural selection does in fact destroy such buffering redundancy over relatively short periods of time? Yet it exists. And this is just the tip of the iceberg. "The study of DNA and genetics is beginning to resemble particle physics. Scientists continually find new layers of organization and increasingly detailed relationships." (Link). Taking these findings into account, it is highly likely that the functionality of the human genome is well above 8.2%.

Darwin said...

Those texts are from Sean Pitman's post on junk DNA. Could you analyze them in your next post?

Mikkel Rumraket Rasmussen said...

It's extremely poor reasoning and vague references to the mere possibility of functional redundancy. Just to pick an example that shows how Sean Pitman's rationalizations don't actually constitute evidence against junk DNA, take this purported answer to the c-value paradox:

//And, who knows, it may turn out that the "c-value conundrum" and the answer to why many functional elements within the human genome are not significantly limited by natural selection is due, at least in part, to various forms of redundancy of functional elements.//

Let's first recall that the c-value paradox is about the huge variations in genome size between different, even closely related species. You can find species of onions that appear similar, but one has 5 times more total DNA than another (and incidentally, has more DNA than humans). How does "functional redundancy" make sense of this observation? Why would one species of onion NEED five times more functional redundancy than another, and why would it need more than humans? This question simply isn't answered at all by Sean's handwave.

And with respect to functional redundancy in general, Sean's article makes no attempt to show that the total size of any species' genome is explained by that redundancy. He doesn't show anywhere the size of the putative redundant elements, or that they account for the excess genome size in any species at all. He simply waves his hand in the direction of some possibility, but does no work to show what proportion of the genome this redundancy accounts for.

In the end, his whole blog post is smoke and mirrors. There's nothing there in the end.

Darwin said...

Mike, thank you very much for your answer, but I have a question about the C-value paradox. Some people who work in the field of genetics say that the very different genome sizes are due to polyploidy. So is polyploidy an answer to the onion test?

Georgi Marinov said...

No, it's part of the answer in some cases. But primarily it is about expansion of repeats and other junk.

Darwin said...

So it means that polyploidy can explain some genomes but not all, because there are many animals that are not polyploid and whose genomes differ so much in size? And even if we knew that all the differences in size were due to polyploidy, that would not explain the C-value enigma either, since according to creationists all the repeats, transposons, retroviruses, and pseudogenes are 100% functional?

Georgi Marinov said...

Animals are a tiny branch on the tree of life, it is a cardinal mistake to think about biology in terms of animals.

Darwin said...

Could you give me a few links to scientists' critiques of ENCODE? I would also like to know whether it has been re-analyzed which part of the genome is functional.

Larry Moran said...

Here's a link to my blog post that lists some of the critiques of ENCODE.

The truth about ENCODE

Here's a summary of the functional elements in our genome.

What's In Your Genome? - The Pie Chart

Darwin said...


Thank you very much Larry I would like to ask you a question

Does polyploidy explain the C-value conundrum or just a small part?

Is all the difference in the size of the genomes due to polyploidy?

Have junk DNA deletions been done in animals to test whether the deleted sequences are functional or essential?

John Harshman said...

Answers: Just a small part, no, on occasion and with ambiguous results.

Regarding polyploidy, you should look up "diploidization"; even where it may apply, the polyploidy explanation lasts only for a limited amount of evolutionary time. All vertebrates, for example, seem to have an octoploid origin (or perhaps two successive tetraploid events), but diploidization has long since rendered that irrelevant. Teleost fish, of course, all have a third tetraploid event in their pasts. And yet fugu still has a much smaller genome than the other vertebrates that lack this event.

Unknown said...

So does that mean that polyploidy is not a plausible answer to the C-value riddle?

John, on the other hand, I would like to know if you have time to answer another question, about the flagellum and the IC argument.

It is true that it has already been discovered that, for example:

The coagulation cascade, according to Russell Doolittle, works in lampreys even though 5 parts are missing, as in dolphins etc., and they still form clots, so it is not irreducibly complex.

The immune system argument was debunked to Behe's face by Eric Rothschild with the stack of books shown to him.

The Eye Debunked by Dan E Nilsson, Trevor Lamb, Fernand,

Now my doubt is about the bacterial flagellum.

I saw Matzke's paper from 2003, where he proposed an evolution of the flagellar system from the type III secretion system, but in 2012 it was discovered that the T3SS came after the flagellum.

Does that invalidate Nick's article?

Are all parts of the flagellum needed for motility, or only 20 parts?

Does the argument against irreducible complexity still stand, since the T3SS, which evolved from the flagellum by losing parts, still has another function?

John Harshman said...

So does that mean that polyploidy is not a plausible answer to the C-value riddle?
Yes, that's exactly what it means.

I don't have answers for your other irrelevant questions.

Unknown said...

I do not think it is irrelevant to talk about an argument that may defy evolution, and I think it is necessary to give updates on the bacterial flagellum and its evolutionary pathway.

Thank you very much for clarifying my question about polyploidy

John Harshman said...

Who are you really?

Darwin said...

My name is Javier Alvarez. I am Cuban and I am 17 years old, and when the pandemic ends I will go to university to study biology.

Darwin said...

Hi Larry I would like to ask you a question about chromosome 2 and Tomkins's argument

Is the DDX11L2 pseudogene found or missing at the fusion site?

John Harshman said...

Sadly, on the internet nobody knows you're a dog, and I have no way to confirm this. People have pretended to be quite different people from what they are on this very site. And creationists have pretended to be innocent people asking innocent questions on this very site, some of those questions similar to what you're asking. Thus I'm unfortunately suspicious of posts like yours. If you really are Javier Alvarez, I apologize, but any atmosphere of trust has been destroyed by some of those who preceded you here.

John Harshman said...

Neither, really. It's not found, but it's not missing. "Missing" would imply that it should be there. What's true is that some databases record a long transcript of the pseudogene that encompasses the fusion site. But that's almost certainly spurious transcription, which occurs throughout the genome. The fusion site isn't actually part of the pseudogene, which after all is a pseudogene.

Larry Moran said...

Here's a link to one of many sites showing that Tomkins doesn't know what he's talking about.

Chromosome 2 Fusion – Dead in a Day?

popgen wannabe said...

Here's just one fairly recent example where polyploidy is NOT the driver of genome size variation:
https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-019-5859-y

Small, but surprisingly repetitive genomes: transposon expansion and not polyploidy has driven a doubling in genome size in a metazoan species complex
J. Blommaert, S. Riss, B. Hecox-Lea, D. B. Mark Welch & C. P. Stelzer
BMC Genomics volume 20, Article number: 466 (2019)

Unknown said...

Thank you very much for the document. I was looking for that a while ago.

Unknown said...

John, you don't have to worry. I'm a critic of intelligent design, and I admire you all for your work teaching evolution and debunking the pseudoscience of the Discovery Institute. One of the people I admire the most, and who has contributed the most to that task, is Larry Moran, who has shown that there is junk DNA and that ENCODE screwed up. But above all, the one I admire the most is Nick Matzke, who worked on the Dover trial and discredited Jonathan Wells, Stephen Meyer and Michael Behe and their books.

Darwin said...

Dr. Moran, I would like to ask you a question about probabilities.

As I am not an expert in probability, could you explain to me, if you have time, what the 2004 work of Douglas Axe means, in which he obtained a very high improbability for a functional protein?

Larry Moran said...

What it means, in the simplest terms possible, is that Doug Axe is an Intelligent Design Creationist and like most of them he will use any argument to confirm his religious bias against evolution.

Quite a few experiments have been done to test the idea that random sequences can produce functional enzymes in spite of what Axe says. All of them show that a small percentage of random sequences have some sort of activity. If Axe were correct these experiments would never have succeeded.

Think about what Doug Axe is actually saying. He's saying that the probability of evolving a functional protein is so low that it could never have happened naturally. This must mean that some unknown designer synthesized every one of the 100,000 known proteins and placed their genes in various combinations in over ten million species. Anyone who thinks that this is a probable scenario must know nothing about biochemistry or evolution.

Darwin said...

Thank you very much for the answer, Larry. I am making a website about evolution and the typical arguments of creationists, and since my area is philosophy and not mathematical probability, I wanted to ask you that question.

By the way, I am collecting observable examples of speciation in nature and I already have 4. If you know of some, you could write them in the comments.

The day that a large-scale speciation is observed, which I am sure will happen one day, will be the end of the creationist arguments against evolution.

Lamarck said...

Hi,

sounds good. But problems with philosophy and mathematics/probability? Maybe Burks's "Chance, Cause, Reason" or Popper's "The Logic of Scientific Discovery" could be helpful.

As for creationism, here is some advice for beginners:

Never argue with an idiot. He pulls you down to his level and defeats you because of his years of experience.

And if you do want to interfere, refutations succeed via scientific theory explanations, but never via facts. If you tell a creationist something about speciation, he will answer you with the creationist catchword "microevolution" (= informal fallacy: "Moving the goalposts").

But regarding speciation as such, first look at the cichlids from the East African rift lakes.


Cheers,

Lamarck

Lamarck said...

Hi,

I am always surprised at how the term "function" is tripped up. Functional morphology is the view of a physician who has something to repair if necessary. Construction morphology is the "view" of evolution. There is no material object that cannot also be functional in some way (= anthropocentric view). Freely based on the motto: Nobody is useless, because he can still serve as a bad example.

This functional fallacy, incidentally, is also the basis of Behe's creationist argumentation. But something like a change of function can only be explained by construction, not by function.

In this context I would like to propose the following classification:

1st order non-junk DNA: Any section of a sequence that, in its entirety, has potential trans-sequential effects. In the narrower sense, the coding regions of the genome.
2nd order non-junk DNA: Any sequence segment which, in its entirety, has potential cis-sequential effects. This refers to all the sequence regions that had to play a rail shot constructively (AKA "control & regulation"). This includes everything that has to do with labels from gene switch to evo-devo.
1st order junk DNA : Everything that is not NJD1 or NJD2 and therefore "mute" (wrecks).
2nd order junk DNA : Everything that can be determined as a form of horizontal gene transfer and is "mute" (fragments). JD2 is necessarily a subset of JD1.

It is to be expected that in this view the proportion of JD in the vertebrate genome is in the order of 90%. The C-value paradox does not play a role here. In addition, it will not be possible to eliminate this 90%, since the genome is dependent on the corresponding "play money" (the principle applies: no cheese, no holes).

It seems that I have to write a paper about methodological weaknesses in the ENCODE project that have not been considered so far. Therefore I ask the group to criticize me.


Cheers,

Lamarck

Darwin said...

Thank you very much, friend, for the example. Also, macroevolution is defined as a change of species, so if examples of speciation are found, their argument against macroevolution disintegrates.

John Harshman said...

Sure. Everything in that post was batshit crazy. That should be enough criticism for anyone.

Lamarck said...

Hi,

in the history of science, the term macroevolution comes from the attempt to grasp the change of higher order taxa, especially through the term "Bauplan". "Bauplan" is the German word for engineering drawing or blueprint (= Richard Owen's archetype - compare in this context the Cuvier-Geoffroy debate of 1830). Microevolution was the counterpart to this for problems of speciation. The term pair microevolution - macroevolution is originally a German invention. Today, however, evolution is evolution is evolution...

Creationists on the other hand claim that microevolution is evolution and limited to one basic type (= "baraminology") and that macroevolution cannot be observed. Two curious things here: (1) If evolution is true, there is exactly one basic type. (2) Creationists confuse microevolution with variation.

Of course, there are plenty of examples of species formation. A curious example is speciation around large cities:

The London Underground mosquito Culex molestus only exists in the London Underground, which was opened in 1863. Its closest relatives live above ground, live on bird blood and hibernate. The subway mosquito lives on commuter blood all year round and therefore only has to travel short distances. There are also the Bakerloo and Victoria lines, which differ significantly in their genetic characteristics. Since the mosquitoes never leave their tunnels, they would have to change at Oxford Circus to mate with the inhabitants of other lines.


Cheers,

Lamarck

Lamarck said...

Hi!

Oh c'mon, chicago duck! Your own annoyance at your lack of ability is no reason to commit rhetorical suicide, is it?

But back to the topic: Do you know why some duck species have a phallus, but many other duck species don’t?


Cheers,

Lamarck

John Harshman said...

No, you're wrong about that. All duck species have penises.

I gave a long reply to your previous post, but it disappeared into the ozone. Lacking the will to retype it, I substituted that short summary. I'm afraid you will have to figure out for yourself just why what you said was batshit crazy.

Who are you really?

Joe Felsenstein said...

Functional morphology is the view of a physician who has something to repair if necessary. Construction morphology is the "view" of evolution. There is no material object that cannot also be functional in some way (= anthropocentric view). Freely based on the motto: Nobody is useless, because he can still serve as a bad example.

This functional fallacy, incidentally, is also the basis of Behe's creationist argumentation. But something like a change of function can only be explained by construction, not by function.


To expand on John H's remarks: what does that quoted stuff mean, if anything? It is meaningless to me.

Darwin said...


Dr. Moran, thank you very much for the answer about the paper by Douglas Axe; it was very useful to me. I would also like to tell you that today a video on junk DNA, debunking claims of functionality, was released by a professor of evolutionary biology and genetics named Dan Cardinale. Here is the link so that you can see it:

https://youtu.be/9Sz559xJpPg

Lamarck said...

Hi, John Harshman,

Correction: Do you know why some duck species have a phallus, but many other avian species don’t?

It is only a small experiment on the subject of scientific thinking...


»I gave a long reply to your previous post, but it disappeared into the ozone. Lacking the will to retype it, I substituted that short summary. I'm afraid you will have to figure out for yourself just why what you said was batshit crazy.«

But of course. Do you know the Latin phrase hic rhodus, hic salta?


»Who are you really?«

This is just your fear in the dark, isn't it?


Cheers,

Lamarck

Lamarck said...

Hi Joe Felsenstein,

obviously everything here is about the notion of function:

»In 2012, the ENCODE collaborators published in essence a book of their findings—thirty article chapters in Nature and several other journals. The results generated almost as much controversy as the genome wars of the late 1990s. According to the ENCODE authors, fully 80 percent of the human genome should be defined as “functional,” meaning that it performs some sort of biochemical role, such as providing binding sites (“enhancers” and “promoters” in the geneticists’ lingo) for transcription factors or gene regulatory proteins that switch genes on and off. The project also shed light on poorly understood stretches of our DNA between genes that give rise to RNA transcripts; unlike messenger RNA, these, for whatever reason, are not translated into proteins. Far from jettisoning vast stretches of the genome sequence as mere junk, ENCODE argued that these tracts actually contained priceless heirlooms and artifacts like a proverbial lost trunk in the attic.

The ENCODE authors’ much-quoted 80 percent claim has been fiercely debated. Controversy hinges on the definition of “functional.” As one of ENCODE’s chief architects, Ewan Birney, explains, that issue boils down to this: does an element in the genome alter the biochemistry of the cell or change the phenotype (appearance) of the organism in some way? Evidence of transcription, in other words, suffices as evidence of functionality, though some would argue things are not so simple. For 8 percent of the genome, there is unequivocal evidence of physical contact between the DNA and proteins such as transcription factors. Admittedly, the function of the other 72 percent of our “functional” DNA is by no means understood. Still, ENCODE provides compelling evidence that even those sequences far removed from gene-coding regions regulate some aspect of gene expression. Birney points out that fully 60 percent of the genome is now categorized as exonic or intronic, “so seeing an additional 20 percent over this expected 60 percent is not so surprising.”«
[Watson et al (2017): DNA - The story of the genetic revolution].

Sounds plausible, but after some consideration there is a fallacy of division of the form “if the genome is functional, then its components are also functional”.


Cheers,

Lamarck

Joe Felsenstein said...

Lamarck: This "fallacy of division" may illuminate the original post above, but it does not tell me anything about how to interpret your declaration that Construction morphology is the "view" of evolution". That statement just seems odd and incomprehensible, however fundamental it may seem to you. Well, perhaps everyone else here gets your point, but it just goes over my head.

To me, if it's not batshit-crazy, it is at least duck-penis crazy.

John Harshman said...

Do you know why some duck species have a phallus, but many other avian species don’t?

Getting closer, but still not correct. All, not some, anseriforms have penises. So, in vestigial form, do all galliforms. All neoavians lack them. No, I don't know why. Do you? If so, how have you determined the reason?

This is just your fear in the dark, isn't it?

You seem to be the one afraid here, since you hide behind a pseudonym. What are you afraid of?

Darwin said...

Dr Moran I apologize for bothering you again but I would like to ask you a few things about evolution

For you, what is the dominant force of evolution, drift or natural selection?

Is the complexity of an organism due or not to the size of the genome or is it due to the number of genes that code for proteins?

Are mutations neutral because they occur in junk DNA?

Is the only solution to the onion test to admit that there is junk DNA in our genome, or is there another alternative?

Lamarck said...

Hi!

»Dr Moran I apologize for bothering you again but I would like to ask you a few things about evolution.«

Please allow me to do this. An elaborated forum is always a honeypot that tempts to a naive »"I have a question about evolution"« instead of reading the basics in Wikipedia. But I enjoy it...


»For you, what is the dominant force of evolution, drift or natural selection?«

Evolutionary processes are based on the biological implementation of the algorithm of trial and error (here: mutation/selection). These heuristic Evolutionary algorithms (EAs) can thus be generalized as general learning algorithms. Genetic drift as well as natural selection are resulting effects.


»Is the complexity of an organism due or not to the size of the genome or is it due to the number of genes that code for proteins?«

Theoretically, the algorithmic complexity of a genome is related to the algorithmic complexity of the associated organism. This may or may not be related to the size of the genome or the number of genes.


»Are mutations neutral because they occur in junk DNA?«

Organisms only have to meet the minimum condition of not interrupting the germ line. To be, or not to be, that is here the only question...


»Is the only solution to the onion test to admit that there is junk DNA in our genome, or is there another alternative?«

The onion thought experiment asks the question why a genome for an onion is five times larger than the genome of a human. The genome of the onion obviously has considerably more redundancy than the genome of a human. One form of redundancy can be described as junk DNA.


Cheers,

Lamarck

Lamarck said...

Hi Joe Felsenstein,

isn't it strange that the term "function" doesn't appear in science outside biology?

An exception is mathematics and the computer science derived from it (binary relation, x functions as y, directed edge in a control-flow graph (CFG)...).

In the naturalistically influenced philosophy of the first half of the 20th century, such as logical empiricism or the Vienna Circle, the concept of function was regarded as a relic of Aristotelian teleology, which must be replaced by an equivalent, purely causal formulation. This conviction is based on the idea of scientific explanations as expressed in the Hempel-Oppenheim scheme.


Cheers,

Lamarck

Lamarck said...

Hi John Harshman,

»All, not some, anseriforms have penises. So, in vestigial form, do all galliforms. All neoavians lack them.«

I do not recommend to use universal quantifications in relation to taxa. It is of course easy to say that birds can be recognized by their feathers. But it becomes more difficult to use such things in the age of dinosaurs. Is the rudiment of a phallus not a phallus or are the Galliformes not actually Anseriformes? Obviously a model is needed for phylogenetic reconstruction. Let's do it!


»No, I don't know why. Do you? If so, how have you determined the reason?«

You must have found your PhD in your corn flakes. But is it possible that someone who has come out of the American educational system can explain why there are penises? Hint: Take ducks as an example: Which came first, penis or vagina?


»You seem to be the one afraid here, since you hide behind a pseudonym. What are you afraid of?«

I may note that I am already a bit older than 10 years old. Pay me an ice cream!


Cheers,

Lamarck

John Harshman said...

I may note that I am already a bit older than 10 years old.
There is no evidence for that claim.

Joe Felsenstein said...

@Lamarck: Whether or not the word "function" appears in sciences other than biology may or may not be an interesting question, to someone other than me. But it seems to have nothing to do with explaining your view that "Construction morphology is the 'view' of evolution". Still waiting for an explanation of that. We keep hearing about dick penises and the usage of the word "function" instead.

Joe Felsenstein said...

A horrible typo on my part. I of course meant to type "duck".

Mikkel Rumraket Rasmussen said...

Now that is funny!

Joe Felsenstein said...

Alas, I is next to U on my keyboard. And I failed to edit it after typing.

John Harshman said...

More grammatically, that would be "I *am* next to U on my keyboard." And yet neither of us is on your keyboard at all, so I'm puzzled even after the correction.

ILBCNU.

Joe Felsenstein said...

That is because I next to U on your keyboard.

Darwin said...

Hi Joe, I would like to ask you if you saw this new scientific article about your work:


https://www.google.com/url?sa=t&source=web&rct=j&url=https://www.researchgate.net/publication/332160743_Revisiting_a_Key_Innovation_in_Evolutionary_Biology_Felsenstein%27s_Phylogenies_and_the_Comparative_Method&ved=2ahUKEwihkaj1isnrAhWQjFkKHTyeD7MQFjABegQIAxAB&usg=AOvVaw20kit-OxcO20pZ_qCJgIFa

Joe Felsenstein said...

Yes, they asked me for comments and I provided them with an account of how I did that work. You will see that account in an Appendix to that paper, when you get to the end of it.

Unknown said...

Joe, thank you very much for your reply. I am a huge admirer of your work on phylogenetics and of your rebuttal, along with Tom English, to the ID proponents who attack the weasel program.

You currently publish content on Panda's Thumb.

And I would like, if you have time, to ask you a question about phylogeny.

Have phylogenetic inconsistencies like cytochrome b or c been resolved, or have they never been a problem?

Unknown said...

Sorry, I meant cytochrome b or cytochrome c.

Joe Felsenstein said...

@Unknown: this is not the place for a private conversation between you and me. You can find my email address on my Curriculum Vitae (CV) and if you want anonymity, use a special Gmail account you open for that purpose.

Darwin said...

Joe, thanks for giving me the web address, but I entered your email address and it tells me that it is not valid. Could you copy and paste it in a new comment? Thank you very much.

Darwin said...

Joe thank you very much you no longer need to write it

Lamarck said...

Hi John Harshman!

»There is no evidence for that claim.«

Certainly. But what does it matter? From me you know now that I have punch lines that make Eminem even paler than he already is. And you are now relaxed enough for a little science. By the way, my little experiment is called organism- vs. gene-centered observation:

With the ducks you see an arms race between vagina and penis. The drakes eliminate sexual selection through a rape strategy and pay for this with the considerable effort of dealing with male competition. Thus penises appear which can scratch out the sperm of their predecessors. Or accidents occur in which the birds involved drown. The expenditure for this is obviously so large that the complex penises, constructed like an airbag, must regress outside the breeding season (the vagina remains). An incubated egg that will become a rooster starts to develop a penis, but early in the second week of embryonic development, a cell death protein called Bmp4 cloaks the incipient penis, causing it to stop developing and instead remain as a rudimentary nub. The gene regulating the regression is here the evolutionarily highly conserved BMP4 gene (compare the BMP4 signal transduction pathway).

This "darwinian story" (thank you, creationists!) is a hypothetical representation, but here it clearly shows that an organism can only be understood top-down from the organismic construction, but not bottom-up (gene-centered). This has consequences for various questions.


Cheers,

Lamarck

Lamarck said...

Hi Joe Felsenstein,

»@Lamarck: Whether or not the word "function" appears in sciences other than biology may or may not be an interesting question, to someone other than me. But it seems to have nothing to do with explaining your view that "Construction morphology is the 'view' of evolution". Still waiting for an explanation of that. We keep hearing about dick penises and the usage of the word "function" instead.«

As is well known, the ENCODE project has the task of mapping all functional elements of the human genome. The word "function" plays a very clear role here. But how is it possible to sort elements by "functions" when base triplets, for example, can be assigned functions? Of course, common algorithmic methods for similarity determination (biology: homology vs. analogy) are used to identify and name sequential structures [sic!]. Laurence A. Moran has created a nice graphic for this purpose (https://sandwalk.blogspot.com/2018/03/whats-in-your-genome-pie-chart.html). As a rhetorical question about this: How, for example, is it determined what "defective transposons and fragments" are?

"Function", as it is used in the PR of ENCODE, is a teleological term and thus represents a gross misconception in relation to natural science. Evolution as a process naturally starts with the organismic construction in its entirety, but not with organismic functions: These are of course always static and unchangeable (Behe knows this with his "irreducible complexity" ;-) ). I use the term "construction morphology" here as a contrast method to show this fact.


Recommended reading:

Gutmann, W. F.; Vogel, K.; Zorn, H. (1978): Brachiopods: Biomechanical inter-dependences governing their origin and phylogeny. Science 199, Issue 4331, pp. 890-893. DOI: 10.1126/science.199.4331.890
Reif, W. E.; Thomas, R. D.; Fischer, M. S. (1985): Constructional morphology: the analysis of constraints in evolution dedicated to A. Seilacher in honour of his 60. birthday. Acta Biotheor. 34 (2-4): 233-48. doi: 10.1007/BF00046787.
Schmidt-Kittler, N. & Vogel, K. (1991): Constructional Morphology and Evolution.


Cheers,

Lamarck

John Harshman said...

If you would try a little less to be clever and a little more to make clear sense, you might get a little more credit. As it is, all I see is nonsense.

Joe Felsenstein said...

Evolution as a process naturally starts with the organismic construction in its entirety, but not with organismic functions: These are of course always static and unchangeable (Behe knows this with his "irreducible complexity" ;-) ). I use the term "construction morphology" here as a contrast method to show this fact.

Whatever. Let us know if this lofty view lets you have some new insight into why so much of the genome is not conserved by natural selection. That’s not obvious from your explanation.

Joe Felsenstein said...

The one thing junk DNA is not is redundant information.

Mikkel Rumraket Rasmussen said...

//"The genome of the onion obviously has considerably more redundancy than the genome of a human. One form of redundancy can be described as junk DNA."//

You're missing a crucial piece of the Onion test, which is that between ostensibly similar species of onion, we also see five-fold differences in genome size. Of course, you can always just declare in ad-hoc fashion that all differences in genome size are due to differences in the amount of "redundancy", it's just that this explanation, in addition to its total ad-hoc nature, collapses under the slightest scrutiny.
Why would one species of onion need, for example, five times more redundant transposable elements than another? Or bigger introns? Or more pseudogenes? It just doesn't make sense, and there doesn't seem to be any rhyme or reason to why some species have much more of this putative redundancy than others. Much less why most of this redundancy seems to come from things that look like degrading copies of things that are capable of facilitating their own genomic proliferation, and were active long ago.
On the other hand, it makes much better sense as the product of stochastic variations in the levels of activity of the processes that yield increases in genome size, such as transposable element activity (be they LTR retrotransposons, introns, repetitive elements prone to duplication, etc).

Lamarck said...

Hi John Harshman!

»If you would try a little less to be clever and a little more to make clear sense, you might get a little more credit. As it is, all I see is nonsense.«

I am just a beautiful mind that needs no recognition whatsoever. But I absolutely need you: Because if you could understand the corresponding thing, then everybody could certainly understand it...


Cheers,

Lamarck

Lamarck said...

Hi Joe Felsenstein!

»Let us know if this lofty view lets you have some new insight into why so much of the genome is not conserved by natural selection.«

Sure thing! The term "natural selection" has in this context a somewhat different content than it is found for example in Darwin. Therefore I transform the question:

What is the reason that the respective genome sizes of the species are so static within narrow limits?

And a homeostatic effect of the telomeres can be deduced from this question: In the analogy of computer technology, genome size would be the equivalent of storage size: it does not matter what load of genes or junk is on it. If it were different, there could be no horizontal gene transfer, no mutations, the corresponding differentiations would otherwise be shot out during the next replication. On the other hand, junk DNA could well be highly conserved: The bias that, for example, lethal point mutations cannot occur in junk DNA (but the repair mechanisms still have an opportunity to take effect) is contrasted by the bias for the corresponding lethal effect elsewhere.

The aforementioned also fits the RNA world hypothesis, in which DNA is responsible for waste disposal.


Cheers,

Lamarck

Lamarck said...

Hi Joe Felsenstein!

»The one thing junk DNA is not is redundant information.«

A base on a DNA strand has an information content of 2 bits, since it can take on 2^2 states {A, T, C, G}. For the human genome with a size of 3.2 Gbp, this results in an information content of Ω = 0.8 GB (SI); the figure of 1.6 GB applies to the diploid 6.4 Gbp.
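
The arithmetic behind this estimate can be checked with a short sketch. It assumes four equally likely bases (so exactly 2 bits per base) and SI gigabytes (10^9 bytes); the helper name `raw_size_gb` is made up for illustration. Under those assumptions, 3.2 Gbp comes to 0.8 GB and the diploid 6.4 Gbp comes to 1.6 GB.

```python
import math

BASES = "ATCG"
# Assumption: all four bases are equally likely, giving log2(4) = 2 bits each.
bits_per_base = math.log2(len(BASES))


def raw_size_gb(base_pairs: float) -> float:
    """Raw, uncompressed size in SI gigabytes (1 GB = 1e9 bytes)."""
    total_bits = base_pairs * bits_per_base
    return total_bits / 8 / 1e9


haploid_gb = raw_size_gb(3.2e9)  # one copy of the human genome
diploid_gb = raw_size_gb(6.4e9)  # both copies

print(f"haploid: {haploid_gb:.1f} GB, diploid: {diploid_gb:.1f} GB")
```

This is only the raw storage size; it says nothing about how much of that sequence is compressible or functional.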

The algorithmic complexity as a measure for the structure of a string is given by the length of the shortest program generating this string. This is a synonym for lossless data compression. In the simplest case there is a string which, for example, consists of a sequence of 3,200,000,000 {A}. However, only a few bits are needed to specify 3.2 × 10^9 × A. This is an excellent way to play with the term redundancy. For example, a string of length on the order of 10^9 over a four-letter alphabet can be compressed extremely well.
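
A rough way to see the compression point is to compare a run of identical letters with a random sequence over the same alphabet. The sketch below rests on several assumptions: it uses Python's `zlib` as a stand-in compressor (which only gives an upper bound on algorithmic complexity, not the complexity itself), it scales the length down to 10^6 for speed, and the random sequence is generated with a fixed seed rather than taken from any real genome.

```python
import random
import zlib

n = 1_000_000  # assumption: scaled down from 3.2e9 so the example runs quickly

# Case 1: the same base repeated n times.
uniform = b"A" * n

# Case 2: n bases drawn uniformly at random (fixed seed for repeatability).
rng = random.Random(0)
shuffled = bytes(rng.choices(b"ACGT", k=n))

size_uniform = len(zlib.compress(uniform, 9))
size_random = len(zlib.compress(shuffled, 9))

print(f"repeated A : {n} -> {size_uniform} bytes")
print(f"random ACGT: {n} -> {size_random} bytes")
```

The repeated string shrinks to a tiny fraction of its length, while the random one cannot go much below 2 bits per base. How compressible a four-letter string is therefore depends on its structure, not on the alphabet size alone.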


Cheers,

Lamarck

Lamarck said...

Hi Rumraket!

»You're missing a crucial piece of the Onion test, which is that between ostensibly similar species of onion, we also see five-fold differences in genome size.«

In both cases, the relation genome size to complexity of organismic construction is the point. If the organismic construction "human" is included, this has an impact on the attention economics.


»Of course, you can always just declare in ad-hoc fashion that all differences in genome size are due to differences in the amount of "redundancy", it's just that this explanation, in addition to its total ad-hoc nature, collapses under the slightest scrutiny.«

(1) Theoretical computer science: Redundancy in relation to algorithmic complexity.
(2) Deductive reasoning: If different lengths of the sequence lead to the same organism, then the amount of this difference forms the redundancy. Empirical verification: The insertion of the gene for human insulin into E. coli still leads to nothing but E. coli (Note: The possibilities for the incorporation of additional genetic material are limited).


»Why would one species of onion need, for example, five times more redundant transposable elements than another? Or bigger introns? Or more pseudogenes? It just doesn't make sense, and there doesn't seem to be any rhyme or reason to why some species have much more of this putative redundancy than others.«

The genome size of Zea mays is 2.2 Gbp and has therefore increased by almost half since the split from Zea luxurians just 140,000 years ago, and not only through polyploidization. While in humans more than 99% of the genome is identical, the current reference genome of corn (Jiao, Y. et al. 2017) has an identity index of only 35% compared to two genetic lines, including genetic differences within the coding sections as well as in the intergenic regions. Furthermore, the genome of the variety 'Palomero' from the Mexican highlands used for popcorn production is 20% smaller than the reference genome. All of the lines mentioned have a different optimum with respect to climatic conditions. Fits well, I think...


Jiao, Y. et al. (2017): Improved maize reference genome with single-molecule technologies. Nature 546, pp. 524–527. https://doi.org/10.1038/nature22971


Cheers,

Lamarck

Joe Felsenstein said...

@Lamarck,

Bizarre. Incomprehensible. And I repeat, the one thing that junk DNA is not is redundant information.

John Harshman said...

Joe: For sure.

Lamarck said...

Hi Joe Felsenstein!

»Bizarre. Incomprehensible.«

I think you should be more explicit here in step two…

Darwin dealt intensively with the breeding of domesticated pigeons to show how species can change (English Fantail, Jacobin, Pouter). He inferred a “natural selection” in nature from the “artificial selection” in culture. So Darwin can of course only be called an adaptationist. Since Darwin knew nothing about mutation in his time, he had problems finding out the real creative process and used Lamarckism as a substitute in his 6th Ed. Darwin's theory of evolution has always remained a theory of selection at its core. Only after the clash between selectionists and mutationists, which ultimately led to the STE, was there a modern theory of evolution.

In Darwin's time it was still common to speak of Geoffroyism, Lamarckism or Darwinism instead of hypotheses or theories. Today there is some confusion about the term Darwinism, which was essentially transported by Ernst Mayr: Ernst Mayr constantly refused to speak of Darwinian theory, in order to cover also the STE with the term Darwinism. In connection with the term Darwinism the term "natural selection" was misinterpreted: While Darwin recognized "natural selection" as cause and the criterion survival or not survival was added as effect to the STE, today "natural selection" is often misinterpreted as the cause.

We look at your question again:

»[...] why so much of the genome is not conserved by natural selection.«

For Darwin, "natural selection" means the cause of the transformation of an organism from n → n'.
In your question there seems to be no equivalent for “pigeons”... ;-)

However, the question is not relevant with regard to the problem of junk DNA either.


Cheers,

Lamarck

Lamarck said...

Hi Joe Felsenstein!

»And I repeat, the one thing that junk DNA is not is redundant information.«

Please show it.

When you talk about the notion of “redundant information”, this is the subject of information theory. “Redundant information” describes the Shannon information that is present several times in an information source. Redundant information increases fault tolerance.


Cheers,

Lamarck

Mikkel Rumraket Rasmussen said...

I really can't make sense of this. You seem to be trying to rebut the inference that variance in genome size between even closely related, and highly similar species, is best explained as variations in the amount of junk-DNA (due in part to stochastic differences in selfish element activity). And you do this by suggesting it could all just be "redundancy", which you're now saying acts as a sort of "fault tolerance". But it's just not clear why two ostensibly similar species of organisms would need such massive differences in the levels of "fault tolerance" at all. Your rationalization here doesn't really seem to explain the observations at all. It's not clear why natural selection would have favored such large variations in fault-tolerance, seemingly unrelated to the types of organism, much less why these huge variations in fault-tolerance must come primarily from things like transposons, introns, repetitive elements, and the like. Your conjecture just doesn't make sense of the data.

John Harshman said...

Interesting. Parts of that last post were coherent. It seems there are bursts of lucidity, just not enough to make any sort of relevant point.

Joe Felsenstein said...

@Lamarck:


>> And I repeat, the one thing that junk DNA is not is redundant information.

> Please show it.


Happily, with a tiny example. Here is what redundant information looks like (in an example from English text):

"The quick brown fox. The quick brown fox."

Here is an example of junk text that is not redundant:

"The quick brown fox. Ohr xugeb qxoeb yhn."

The second part of the text is redundant in the first case, and is junk in the second case. I think that perhaps you misunderstood a phrase that is redundant in the sense that it is not helping convey any information as being a phrase that is redundant in the sense of repeating the information.
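
The distinction can be illustrated with a small sketch that runs both example strings through a general-purpose compressor. This is only an illustration under stated assumptions: `zlib` stands in for "how much of the text is repetition", and the helper `compressed_len` is a name chosen here, not something from the discussion.

```python
import zlib

# The two example texts from the comment above, both 41 characters long.
repeated = b"The quick brown fox. The quick brown fox."
scrambled = b"The quick brown fox. Ohr xugeb qxoeb yhn."


def compressed_len(data: bytes) -> int:
    """Length in bytes after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


print("repeated :", len(repeated), "->", compressed_len(repeated))
print("scrambled:", len(scrambled), "->", compressed_len(scrambled))
```

The repeated text compresses to fewer bytes because its second half can be encoded as a back-reference to the first. The scrambled second half offers nothing to refer back to, so it is junk in the sense of carrying no message, but it is not redundant in the sense of repeating one.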

And no amount of showing off how much technical knowledge you have of different kinds of information theory is helpful in discussing this.

Lamarck said...


Hi John Harshman!

»Interesting. Parts of that last post were coherent. It seems there are bursts of lucidity, just not enough to make any sort of relevant point.«

You ain't seen nothing yet. But if you say you understood parts of what I said, I have to call you a scientist, maybe even a biologist. But not without peer review: Which part exactly did you manage to understand? Please do a little science... ;-)


Cheers,

Lamarck

Lamarck said...

Hi Joe Felsenstein!

»Happily, with a tiny example.«

I like your humor…

»Here is what redundant information looks like (in an example from English text): [...]«

Actually I would like to see a little more mathematics and a little less anecdotal evidence on this. But let's have a look at the complete English pangram:

[1] »The quick brown fox jumps over the lazy dog«

This pangram, which consists of 35 letters, can be shortened to a pangram consisting of 33 letters:

[2] »A quick brown fox jumps over the lazy dog«

Between [1] and [2] there is a redundancy if it is required to optimize a sentence consisting of all 26 letters of the English language regarding the length of the string. However, the meaning of the two sentences to be compared is very similar. The optimum, namely to find a suitable sentence with only 26 different letters, is obviously not achievable. A comparable example for redundancy:

»Felsenstein | Flsnstn | rock stone«
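
The letter counts quoted for pangrams [1] and [2] can be checked with a few lines of code. This is a minimal sketch; the helper names `letter_count` and `is_pangram` are chosen for illustration, and "surplus" here simply means letters beyond the 26 that a pangram needs at minimum.

```python
import string

pangram_1 = "The quick brown fox jumps over the lazy dog"
pangram_2 = "A quick brown fox jumps over the lazy dog"


def letter_count(sentence: str) -> int:
    """Count alphabetic characters, ignoring spaces and punctuation."""
    return sum(ch.isalpha() for ch in sentence)


def is_pangram(sentence: str) -> bool:
    """True if every letter a-z occurs at least once."""
    return set(string.ascii_lowercase) <= set(sentence.lower())


for label, text in (("[1]", pangram_1), ("[2]", pangram_2)):
    n = letter_count(text)
    print(label, "letters:", n, "surplus over 26:", n - 26,
          "pangram:", is_pangram(text))
```

Both sentences contain every letter; [1] carries 9 letters beyond the minimum of 26 and [2] carries 7.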

»And no amount of showing off how much technical knowledge you have of different kinds of information theory is helpful in discussing this.«

However, as one of Darwin's bulldogs, I stand by your claim in this respect:

»And I repeat, the one thing that junk DNA is not is redundant information.«

If I knew what exactly you mean by “redundant information”, I could quickly develop an app that automatically analyzes a sequence and searches for it. What exactly should I search for? Maybe for non-CpG islands? ;-)


Cheers,

Lamarck

Joe Felsenstein said...

Interesting. Parts of that last post were coherent. It seems there are bursts of lucidity, just not enough to make any sort of relevant point.

John Harshman said...

Apparently the troll enjoys feeding. Time to starve the troll.

Lamarck said...

Hi Rumraket!

»You seem to be trying to rebut the inference that variance in genome size between even closely related, and highly similar species, is best explained as variations in the amount of junk-DNA (due in part to stochastic differences in selfish element activity).«

(i) Not at all. I have demonstrated this using the example of corn, Zea mays: There, genetic lines with extreme differences in genome sizes can be found even within a single species. However, the genome sizes of the genetic lines are quite stable.

(ii) Although the germ line of both Rumraket and Lamarck has a different history of bombardment of events in horizontal gene transfer or double-strand breaks, we both have, despite - no, rather because of natural selection, comparable sizes of our genomes. From this follows:

(ii.j) The genome size is subject to natural selection, namely via the process described in (ii.jj).
(ii.jj) The cybernetics of the sequence determines the organism, but not its arrangement or its primary structure.
(ii.jjj) If it is possible to infer the existence of genes in humans from the similarity of the sequences of genes in pufferfish, this also means that genes and their regulatory units are in principle sufficient for the viability of the organism concerned: If, as is the case here, the genome size of the pufferfish is one order of magnitude smaller than that of humans, but the number of genes is comparable, it follows that humans have to cope with far more redundancies than the pufferfish.

(iii) »Stochastic differences« play no role here.

(iv) Richard Dawkins has written a wonderful book on evolutionary game theory. Unfortunately he called it ‘The Selfish Gene’. Please do not talk about "selfish" here or put a coat on a dog...

»And you do this by suggesting it could all just be "redundancy", which you're now saying acts as a sort of "fault tolerance".«

Please delete here everything that goes beyond »redundancy <> fault tolerance«.

»But it's just not clear why two ostensibly similar species of organisms would need such massive differences in the levels of "fault tolerance" at all.«

Please say goodbye to this way of thinking. The usage of the word ‘need’ is in this context a teleological fallacy.

»It's not clear why natural selection would have favored such large variations in fault-tolerance, seemingly unrelated to the types of organism, much less why these huge variations in fault-tolerance must come primarily from things like transposons, introns, repetitive elements, and the like.«

The genome must be fault-tolerant: it has to cope with things like mutations. But without mutations there is no biological evolution. Since the probability of survival is not directly measurable, we therefore have to look at it qualitatively rather than quantitatively: No matter how full the garage is with junk, the main thing is that the car still fits.


Cheers,

Lamarck