
Wednesday, May 05, 2021

Lab leak conspiracy theory rears its ugly head again: this time it's Nicholas Wade of the New York Times

Nicholas Wade used to be a serious science writer but he lost that title many years ago when he proved that he was incapable of distinguishing fact from wishful thinking [Nicholas Wade on the Origin of Life]. Now he's gone completely bonkers by promoting the ridiculous conspiracy theory that the COVID-19 pandemic was started when the SARS-CoV-2 virus leaked from a lab at the Wuhan Institute of Virology (WIV) [Origin of Covid — Following the Clues].

Nicholas Wade claims that the virologists at the WIV, led by Dr. Shi, created the SARS-CoV-2 virus by genetic engineering. Their goal, according to Wade, was to make a virus that was as deadly to humans as possible in order to study its effects in the lab. Unfortunately, the virus escaped from the lab, according to Wade, and started the pandemic.

Shi Zhengli responded to those silly accusations in July 2020 [Wuhan coronavirus hunter Shi Zhengli speaks out].

On 15 July, Shi emailed Science answers to a series of questions about the virus' origin and her research. In them, she hit back at speculation that the virus leaked from WIV. She and her colleagues discovered the virus in late 2019, she says, in samples from patients who had a pneumonia of unknown origin. “Before that, we had never been in contact with or studied this virus, nor did we know of its existence,” Shi wrote.

“U.S. President Trump's claim that SARS-CoV-2 was leaked from our institute totally contradicts the facts,” she added. “It jeopardizes and affects our academic work and personal life. He owes us an apology.”

Why is this a conspiracy theory? Because the speculation has been investigated by WHO scientists who found no evidence to support it. They saw that the lab protocols at the Institute were very good, as you would expect for a world class lab that was studying dangerous viruses that were known to cause pandemics. Furthermore, none of the workers at the lab tested positive for COVID-19 and none of them were studying any virus that resembled SARS-CoV-2. So, in order for the lab leak hypothesis to be true there has to have been a massive coverup by a very large number of people. That's what makes it a conspiracy theory.

Nicholas Wade gets a lot of his information from Richard Ebright who has been promoting the lab leak conspiracy theory for the past year. Ebright thinks the WHO investigators "... were willing—and in at least one case, enthusiastic—participants in disinformation" [An Interview with Richard Ebright: The WHO Investigation Members Were “participants in disinformation”]. This is classic conspiracy theory stuff: everyone who disagrees with you is part of the conspiracy.

If you still think the lab leak conspiracy theory is true then I urge you to watch this video of a talk by Professor Edward ("Eddy") Holmes, the 2020 New South Wales (Australia) scientist of the year and an expert on human viruses, especially the coronaviruses [The Discovery and Origins of SARS-CoV-2]. He explains why the viruses are likely to originate in bats and explains why this particular virus started off in bats but probably passed through an intermediate host before reaching humans. (His preferred intermediate host is raccoon dogs and he explains why he thinks this is likely.) He explains why the sequence of the virus is entirely consistent with a natural origin. He describes his field work in China and Southeast Asia and his collaborations with the expert scientists in China, including those at the Wuhan Institute of Virology.

Holmes addresses the conspiracy theory at 41:45 minutes into the talk so you can skip right to there if you like, although I don't recommend it because there's lots of useful information in the first 40 minutes. Here's why he rejects that conspiracy theory and why you should too. These are the facts, according to Holmes. I agree with him.

  • There's "no evidence that SARS-CoV-2 is engineered (and no reason to bioengineer a random bat virus)." Holmes calls this idea "absolute nonsense." I'm guessing he won't be a fan of Nicholas Wade's article.
  • "Bat virus RaTG13 is not the direct ancestor of SARS-CoV-2—all the components of the virus exist in nature."
  • "No evidence of a secret SARS-CoV-2-like virus kept at the WIV (and no reason to keep it a secret before the pandemic)." The scientists at WIV say that they were not studying such a virus and Holmes says, "Frankly, I believe them." Nicholas Wade thinks they are lying but offers no proof and no reason to justify the lie.
  • The SARS-CoV-2 virus is probably not directly from bats and WIV was only studying bat viruses. Furthermore, the virus is probably not from Yunnan province, where the WIV collected many of its bat samples. (The Institute itself is in Wuhan, in Hubei province.)
  • "SARS-CoV-2 was not perfectly adapted to humans on first emergence and appears to be a "generalist" virus." Nicholas Wade is wrong about this as well.
  • "Cases near WIV only appeared later in the outbreak." The first cases in Wuhan appear in the market, specifically in the area where live animals are sold. This strongly suggests that the virus came from animals in the market and that it originated in those animals somewhere else. There were cases in December 2019 that were not linked to the market but they were nowhere near the WIV.
  • "No evidence of SARS-CoV-2 infection at WIV—staff were PCR/antibody negative." Holmes says that if this is true then that rules out the lab leak hypothesis automatically. He says that either this is the biggest coverup in history and they're all lying or there's no evidence at all that the virus was ever in the lab. He concludes that the virus did not come from the lab but he's sure that the conspiracy theory is not going to go away anytime soon.

Holmes is right. The conspiracy theory is not going away because its proponents think that all Chinese are evil and can't be trusted. Those conspiracy believers are wrong. Please don't spread this ridiculous idea; it makes you no better than QAnon cultists.

If you're really interested in the facts then there are several articles on the origin of SARS-CoV-2 that you should read before falling for the lab leak conspiracy theory. Here's one.

MacLean, O.A., Lytras, S., Weaver, S., Singer, J.B., Boni, M.F., Lemey, P., Pond, S.L.K. and Robertson, D.L. (2021) Natural selection in the evolution of SARS-CoV-2 in bats created a generalist virus and highly capable human pathogen. PLoS Biology 19:e3001115. [doi: 10.1371/journal.pbio.3001115]

Virus host shifts are generally associated with novel adaptations to exploit the cells of the new host species optimally. Surprisingly, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has apparently required little to no significant adaptation to humans since the start of the Coronavirus Disease 2019 (COVID-19) pandemic and to October 2020. Here we assess the types of natural selection taking place in Sarbecoviruses in horseshoe bats versus the early SARS-CoV-2 evolution in humans. While there is moderate evidence of diversifying positive selection in SARS-CoV-2 in humans, it is limited to the early phase of the pandemic, and purifying selection is much weaker in SARS-CoV-2 than in related bat Sarbecoviruses. In contrast, our analysis detects evidence for significant positive episodic diversifying selection acting at the base of the bat virus lineage SARS-CoV-2 emerged from, accompanied by an adaptive depletion in CpG composition presumed to be linked to the action of antiviral mechanisms in these ancestral bat hosts. The closest bat virus to SARS-CoV-2, RmYN02 (sharing an ancestor about 1976), is a recombinant with a structure that includes differential CpG content in Spike; clear evidence of coinfection and evolution in bats without involvement of other species. While an undiscovered “facilitating” intermediate species cannot be discounted, collectively, our results support the progenitor of SARS-CoV-2 being capable of efficient human–human transmission as a consequence of its adaptive evolutionary history in bats, not humans, which created a relatively generalist virus.


Monday, May 03, 2021

More illusions/delusions of James Shapiro and Denis Noble

It was just a few weeks ago that I discussed short articles by Denis Noble and James Shapiro that were published in the journal Biosemiotics [The illusions of Denis Noble] [The illusions of James Shapiro].

Several readers questioned whether Biosemiotics is a real science journal and they were right: it's a kooky journal and that's why it publishes papers by kooks. However, we now have a new paper by Shapiro and Noble that's about to appear in a legitimate scientific journal; albeit, one that has seen better days. This would normally raise red flags concerning peer review but we're long past the time when we can count on peer review to weed out the kooks.

Here's the paper. I'm not going to discuss all the main points because they were covered in my previous posts. I'll just concentrate on the most ridiculous part in order to illustrate the (lack of) quality of this paper.1

Shapiro, J. and Noble, D. (2021) What prevents mainstream evolutionists teaching the whole truth about how genomes evolve? Progress in Biophysics and Molecular Biology. [doi: 10.1016/j.pbiomolbio.2021.04.004]

The common belief that the neo-Darwinian Modern Synthesis (MS) was buttressed by the discoveries of molecular biology is incorrect. On the contrary those discoveries have undermined the MS. This article discusses the many processes revealed by molecular studies and genome sequencing that contribute to evolution but nonetheless lie beyond the strict confines of the MS formulated in the 1940s. The core assumptions of the MS that molecular studies have discredited include the idea that DNA is intrinsically a faithful self-replicator, the one-way transfer of heritable information from nucleic acids to other cell molecules, the myth of “selfish DNA,” and the existence of an impenetrable Weismann Barrier separating somatic and germ line cells. Processes fundamental to modern evolutionary theory include symbiogenesis, biosphere interactions between distant taxa (including viruses), horizontal DNA transfers, natural genetic engineering, organismal stress responses that activate intrinsic genome change operators, and macroevolution by genome restructuring (distinct from the gradual accumulation of local microevolutionary changes in the MS). These 21st Century concepts treat the evolving genome as a highly formatted and integrated Read-Write (RW) database rather than a Read-Only Memory (ROM) collection of independent gene units that change by random copying errors. Most of the discoverers of these macroevolutionary processes have been ignored in mainstream textbooks and popularizations of evolutionary biology, as we document in some detail. Ironically, we show that the active view of evolution that emerges from genomics and molecular biology is much closer to the 19th century ideas of both Darwin and Lamarck. 
The capacity of cells to activate evolutionary genome change under stress can account for some of the most negative clinical results in oncology, especially the sudden appearance of treatment-resistant and more aggressive tumors following therapies intended to eradicate all cancer cells. Knowing that extreme stress can be a trigger for punctuated macroevolutionary change suggests that less lethal therapies may result in longer survival times.

The section on "selfish DNA" is the one that seems to have the highest number of misleading and false statements per paragraph.

1.4. The end of “selfish” or “junk” DNA

A major shortcoming of the MS is that it was based on a “gene-centric” view, which assumed that the genome is basically a collection of “genes” that are the protein-coding units of heredity and heritable variation. As we saw in the quotation from Goldschmidt's 1940 book, this view failed to take the evolutionary importance of chromosome structure into account (Goldschmidt, 1940). It also blinded evolutionary biologists to the importance of McClintock's mid-20th Century discovery of mobile “controlling elements” (McClintock, 1987). Both the ideas of genetic transposition and control of gene expression by these non-coding mobile elements did not fit within the narrow confines of the MS concepts of genome function and variation. A further empirical assault on the limited MS conceptual framework came in the late 1960s when Britten and Kohne discovered that a significant fraction of genomic DNA from complex eukaryotes consists of highly repetitive sequences rather than the unique coding sequences expected to make up the hereditary material (Britten and Kohne, 1968).

  • The title is ridiculous since no respectable scientist ever equated selfish DNA with junk DNA [Selfish genes and transposons].

  • The Modern Synthesis (MS) was not based on a "gene-centric" view.
  • For the past 50 years, no respectable scientist, and no knowledgeable expert in molecular evolution, has restricted the definition of "gene" to just protein-coding genes.
  • For the past 50 years, no expert in molecular evolution has ever thought that the genome is just a collection of protein-coding genes.
  • For the past 50 years, experts in molecular biology have known about transposons and have considered the view that some of them might be "controlling elements." They have concluded that most transposon-related sequences are just fragments of defective transposons with no biological function.
  • Nobody cares whether mobile genetic elements fit within the narrow confines of the Modern Synthesis as described by Huxley and others in the 1940s because no expert in molecular evolution has believed in that view of evolution since the late 1960s.
  • The Britten and Kohne paper established that the genomes of most multicellular eukaryotes contain large amounts of repetitive DNA. This was an attempt to resolve the C-value paradox. Britten and Kohne didn't like the idea that this could be junk DNA so they offered some speculation about function. However, further data established that most of this repetitive DNA is, indeed, junk and Britten and Kohne's speculations have been discredited. Britten and Kohne were attempting to interpret their result within the context of the adaptationist views that characterized the Modern Synthesis back then. The correct interpretation of their results came with the overthrow of the Modern Synthesis and the adoption of a new view of evolutionary theory that focused on Neutral Theory, Nearly-Neutral Theory, and the importance of random genetic drift. Shapiro and Noble missed that revolution so they continue to attack an old-fashioned strawman version of evolutionary theory.

Before continuing, it's important to realize that by the early 1970s selectionist thinking had been abandoned by the experts in genome evolution. By 1978 Gould and Lewontin tried, unsuccessfully, to convince all other biologists to abandon the old selectionist way of thinking [The Spandrels of San Marco and the Panglossian Paradigm]. James Shapiro and Denis Noble are among those other biologists who didn't get the message.

In order to apply selectionist thinking to explain the presence of so much non-coding DNA, evolutionary biologists called this unexpected portion of the genome “junk DNA” (Ohno, 1972) or “selfish DNA” (Orgel and Crick, 1980). Richard Dawkins used an extreme view of these “selfish genes” to erect a whole philosophy of strictly passive evolutionary gradualism (Dawkins, 1976). Today we know that the human genome contains at least 30X as much repetitive non-coding DNA as protein-coding sequences (Lander et al., 2001). Repetitive DNA provides formatting signals for transcription, epigenetic modification and chromosome mechanics and also is the most variable component in the evolutionary diversification of complex genomes (Symonová and Howell, 2018; Subirana et al., 2015; Matsubara et al., 2016; CioffiMde et al., 2015; Chalopin et al., 2015; Shao et al., 2019; Böhne et al., 2008; Li et al., 2016; Oliver et al., 2013). A 2013 plot of organismal complexity against protein-coding and non-coding DNA showed that coding DNA peaked at approximately ∼3 × 10^7 bp, while the non-coding DNA increased linearly with growing complexity up to ∼2–3 × 10^10 bp (Liu et al., 2013). In other words, non-coding DNA tracked organismal complexity better than the protein-coding genes. The “encyclopedia of DNA elements” (ENCODE) project, which largely abandoned the term “gene,” revealed that the large majority of the so-called junk DNA is actively transcribed in a regulated manner, indicating that it is functional (Consortium, 2012; Pennisi, 2012).

  • It is completely, totally, ridiculous to say that the idea of junk DNA was due to selectionist thinking. The first statement in this paragraph is powerful evidence that Shapiro and Noble don't know what they are talking about. The concept of junk DNA is a rejection of selectionist thinking.
  • The use of "noncoding DNA" is what's called a "tell."
  • Again, equating junk DNA with selfish DNA is stupid. If all the excess DNA were selfish then it isn't junk because it has a function.
  • Richard Dawkins' view on evolution is closer to the old-fashioned adaptationist view that was abandoned by the experts by the time he wrote The Selfish Gene. Dawkins' book is not really about "genes," however, as is clear to anyone who has read it. He's talking about any piece of DNA that confers a fitness advantage. The Dawkins strawman is a favorite target of the Third Way types but it's just a strawman.
  • No significant proportion of repetitive DNA has a function in spite of the references quoted above.
  • There is no significant correlation between organismal complexity and noncoding DNA. Lots of very similar species, such as onions, have very different genome sizes.
  • No knowledgeable scientist since the 1980s thinks there should be a significant correlation between the number of genes and organismal complexity. We know that most of the phenotypic differences between multicellular species are due to changes in the timing and amount of expression of a standard set of genes. This is the main discovery of evolutionary-developmental biology (evo-devo), another revolution that Shapiro and Noble missed. They should educate themselves by reading Sean B. Carroll's books.
  • The ENCODE researchers did lots of silly things but they did NOT abandon the term "gene."
  • The idea that most of our genome is functional because of ENCODE is laughable in 2021. The fact that Shapiro and Noble would bring this up is another "tell" and the fact that they would reference Elizabeth Pennisi is even more revealing. These guys are incapable of thinking critically.

Shapiro and Noble then describe a few examples of repetitive DNA sequences that have a known function and they point out that a number of noncoding genes have been identified. They imply that these functional sequences make up a significant fraction of the genome, thus calling the concept of junk DNA into question. They close the section with,

Clearly, none of the eminent scientists who wrote about junk or selfish DNA could possibly have imagined the wide range of cellular functionalities that we know today are executed by ncRNA molecules. The idea that a genome was just a collection of protein coding sequences has proved completely inadequate.

  • I don't know about you, dear reader, but I'll match those "eminent scientists" against Shapiro and Noble any day. I'd love to see them try to defend their views in a public debate against some of the leading proponents of junk DNA. I know where my money would be.

Let me close by quoting the last chapter of this paper. I don't intend to comment on it except to say that it gives new meaning to the word "irony."

The campaign to sustain the Modern Synthesis causes real harm in a number of different ways. Among doctors treating bacterial infections, ignorance of real-world evolutionary processes has led to a situation in which the available antibiotics have lost their effectiveness against many life-threatening conditions (CDC et al., 2019). Among the general public, the inability to comprehend the potential all living organisms possess for transferring and reorganizing genomic configurations makes them unprepared to form sound judgements about how society should utilize its growing arsenal of biotechnology tools acquired from our microbial neighbors, like CRISPR (Doudna, 2020). Among oncologists, MS thinking prevents the practitioners treating cancer patients from recognizing the dangers of overtreating tolerable tumors in ways that may provoke a macroevolutionary transition to a far more lethal and untreatable disease (Heng, 2019). Finally, in the battle against obscurantism and anti-evolution prejudice, insistence on an outdated set of assertions about how life can change itself leaves the defenders of rigorous scientific inquiry without satisfactory responses to critics. Clearly, the time has come for the mainstream evolution community to recognize and join the scientific reality of the 21st Century.

Finally, one of the most important properties of kooks is that they find each other and they tend to hang out together, either physically or virtually. I'm not sure why this happens since they often espouse mutually exclusive views. I'm guessing that we can explain it in two different ways: (1) they are all outsiders fighting against a common enemy; namely, real science, and (2) they lack critical thinking skills so they don't see the flaws in each other's arguments.


1. In case you didn't recognize the quality from the title.

Thursday, April 29, 2021

Chromatin organization at promoters in yeast cells

Our genome is very large and very complicated because it is full of junk DNA. It contains thousands of sites where DNA binding proteins can bind just by chance. This leads to the reorganization of nucleosomes in a way that mimics functional sites. It's difficult to distinguish these spurious sites from real functional sites and that has led to much confusion in the scientific literature.1

The yeast genome is much simpler and it's safe to assume that almost all of the sites detected by the standard chromatin assays are genuine, biologically relevant, sites. In that sense, it serves as a model for what functional sites look like. A recent paper in Nature (April 8, 2021) reports on the mapping of most of the sites in the yeast genome where DNA binding proteins are found.

Rossi, M.J., Kuntala, P.K., Lai, W.K., Yamada, N., Badjatia, N., Mittal, C., Kuzu, G., Bocklund, K., Farrell, N.P., Blanda, T.R.M., Joshua D, V, B.A., Mistretta, K.S., Rocco, D.J., Perkinson, E.S., Kellogg, G.D., Mahony, S. and Pugh, B.F. (2021) A high-resolution protein architecture of the budding yeast genome. Nature 592:309-314. [doi: 10.1038/s41586-021-03314-8]

Origins of replication

Origins of replication are also called autonomously replicating sequence consensus sequences (ACS). There are 253 of them in the yeast genome and they are characterized by a 300 bp nucleosome-free region that's occupied by the origin recognition complex (ORC) and the helicase MCM.

Telomeres

Telomeres are bound by a number of proteins including silent information regulators (SIRs). There's a nucleosome-free region of about 300 bp where these proteins are located.

Centromeres

The nucleosome-free region at centromeres covers only 170 bp where a number of centromere binding proteins are located. The absence of nucleosomes at the centromere is a surprise since it was thought that centromere DNA was bound by modified nucleosomes containing a specific histone variant.

Tuesday, April 27, 2021

Asymptomatic and presymptomatic spread of SARS-CoV-2

It is widely believed that a substantial amount of viral spread is due to individuals who are transmitting the virus but have no symptoms (asymptomatic spread). However, there's so much misinformation about COVID-19 out there that I'm having trouble sorting out real science from fake science, so I've become skeptical of just about everything.

I'm not just talking about the kind of fake science being spread on FOX News, I'm also talking about misinformation spread by ordinary people like me and the typical readers of this blog. We might do it inadvertently but it's still wrong.

What's the real data on asymptomatic spread? I don't know, but here's a summary of the issue in a recent issue of Science. It sounds good to me because the authors take steps to address questions that seem obvious.

Rasmussen, A.L. and Popescu, S. V. (2021) SARS-CoV-2 transmission without symptoms. Science 371: 1204-1207. [doi: 10.1126/science.abf9569]

Sunday, April 25, 2021

Happy DNA day 2021!

It was 68 years ago today that the famous Watson and Crick paper was published in Nature along with papers by Franklin & Gosling and Wilkins, Stokes, & Wilson. There's a great deal of misinformation circulating about this discovery so I wrote up a brief history of the events based largely on Horace Freeland Judson's book The Eighth Day of Creation. Every biochemistry and molecular biology student must read this book or they don't qualify to be an informed scientist. However, if you are not a biochemistry student then you might enjoy my short version.

Some practising scientists might also enjoy refreshing their memories so they have an accurate view of what happened in case their students ask questions.

The Story of DNA (Part 1)

Where Rosalind Franklin teaches Jim and Francis something about basic chemistry.

The Story of DNA (Part 2)

Where Jim and Francis discover the secret of life.

Here are some other posts that might interest you on DNA Day.



Wednesday, April 21, 2021

Douglas Axe pretends to be an expert on intelligent design

This is a really interesting video presentation by Douglas Axe, a leading proponent of Intelligent Design Creationism. He's criticizing the argument from poor design; an argument that attempts to refute intelligent design by pointing out examples of poor design that a creator would never create. Axe uses an example from Neil deGrasse Tyson and if you look at this objectively you would say that Axe does a pretty good job of refuting Tyson's claims.

Tyson is not a biologist and he shouldn't pretend to be one, but that's not the most interesting take-home lesson from this video. The most interesting point concerns the comments Douglas Axe makes at the end of the video beginning at 11:30 minutes. He claims that Neil deGrasse Tyson is not an expert on designing life so it's foolish of him to pretend that he knows anything about the subject. When you hear someone making an imperfect design argument he asks his listeners to challenge them by saying, "What have YOU made that you think qualifies you to critique life?"

Yep. He actually said that! Someone who promotes intelligent design without any experience in designing life actually tried to use that argument against opponents of intelligent design.

God has an inordinate fondness for beetles.


J.B.S. Haldane

The burden of proof is on Intelligent Design Creationists to demonstrate how their view is compatible with science and with the history of life. They have to demonstrate why it took 3.5 billion years to get where we are today and why the history of life is so compatible with evolution. They have to demonstrate why millions of species of bacteria and almost as many species of beetles can only be explained by the actions of an intelligent designer. They have to explain why all the data shows that modern humans and chimpanzees have descended by gradual fixation of mutations from a common ancestor that lived only a few million years ago. They have to explain why an intelligent designer would design a genome that's 90% junk.

These creationists haven't made anything that qualifies them to be experts on the design of life1 but I'm willing to listen to any ideas they have. So far, all we've seen is criticisms of evolution, which is also a topic where they lack expertise.


The Haldane quotation is accurate. See "A Special Fondness for Beetles" by Stephen J. Gould in Dinosaur in a Haystack.

1. Unless they have some special insight into the mind of god in which case they should be able to tell us exactly how he did it. Why did he create all those strange animals in the Cambrian only to allow most of them to go extinct? And speaking of extinctions, what did he have against most dinosaurs that he decided to kill them by smashing a meteor into the Earth 66 million years ago? Can you explain that, Dr. Axe?

The illusions of James Shapiro

James A. Shapiro is a professor in the Department of Biochemistry and Molecular Biology at the University of Chicago (Chicago, USA). He made significant contributions to our understanding of the function and structure of transposons but in later years he has become a vocal opponent of evolution culminating in his 2011 book Evolution: A View from the 21st Century. He is one of the founding members of The Third Way of Evolution.

I wrote a critical review of Evolution: A View from the 21st Century for the National Center for Science Education (NCSE) Reports but the issue is no longer visible on the web. Shapiro didn't like my review so NCSE published his rebuttal and that's also unavailable. You can see my response at: James Shapiro Responds to My Review of His Book.

Monday, April 19, 2021

The illusions of Denis Noble

Denis Noble was a Professor of Physiology at Oxford University in the United Kingdom until he retired. He had a distinguished career as a physiologist making significant contributions to our understanding of the heart and its relationship to the whole organism.

In recent years, Noble has dabbled in philosophy and evolution. He has become a vocal opponent of modern evolution (sensu Noble) and the way science is currently conducted. Some of his criticisms have made it into two popular books: The Music of Life and Dance to the Tune of Life. He is one of the leading proponents of the "Extended Evolutionary Synthesis" (EES) and he is one of the founders of The Third Way of Evolution, a wishy-washy and scientifically inaccurate way of attacking a strawman version of evolution and providing a safe haven for religious scientists.

Saturday, April 17, 2021

Philosophers argue that scientific conclusions need not be accurate, justified, or believed by their authors

A remarkable paper has just been posted to a philosophy of science preprint website. (It will be published in Synthese.) Like many papers in this field it's difficult to read and the logic is abstruse but the bottom line is that scientists don't really need to be held to the old standards that we scientists used to think are essential.

Dang, Haixin and Bright, Liam Kofi (2021) Scientific Conclusions Need Not Be Accurate, Justified, or Believed by their Authors. PhilSci Archive [PDF]

We argue that the main results of scientific papers may appropriately be published even if they are false, unjustified, and not believed to be true or justified by their author. To defend this claim we draw upon the literature studying the norms of assertion, and consider how they would apply if one attempted to hold claims made in scientific papers to their strictures, as assertions and discovery claims in scientific papers seem naturally analogous. We first use a case study of William H. Bragg’s early 20th century work in physics to demonstrate that successful science has in fact violated these norms. We then argue that features of the social epistemic arrangement of science which are necessary for its long run success require that we do not hold claims of scientific results to their standards. We end by making a suggestion about the norms that it would be appropriate to hold scientific claims to, along with an explanation of why the social epistemology of science—considered as an instance of collective inquiry—would require such apparently lax norms for claims to be put forward.

Tuesday, April 13, 2021

How do you explain evolution to non-experts?

I spent a lot of time explaining evolution in my book. The goal is to educate readers to the level where they can understand the drift-barrier hypothesis and why slightly deleterious mutations can accumulate in species with small populations. This requires some knowledge of random genetic drift and some knowledge of Neutral Theory and Nearly-Neutral Theory. The emphasis is on population genetics as the most important way of understanding evolution.
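For readers who want to see why population size matters here, the following is a minimal sketch and not something taken from the book. It uses Kimura's diffusion approximation for the fixation probability of a new mutation in a diploid population, assuming the census size equals the effective size and selection acts additively. The function names and the example selection coefficient are illustrative assumptions.

```python
import math


def fixation_probability(s: float, n_e: float) -> float:
    """Kimura's approximation for a new mutation starting at frequency 1/(2N).

    Assumes a diploid population with census size equal to effective size
    (N = n_e) and additive selection with coefficient s (negative = deleterious).
    """
    if s == 0:
        return 1.0 / (2.0 * n_e)  # neutral case: pure random genetic drift
    exponent = -4.0 * n_e * s
    if exponent > 700:  # strongly deleterious in a huge population; avoid overflow
        return 0.0
    return math.expm1(-2.0 * s) / math.expm1(exponent)


def relative_to_neutral(s: float, n_e: float) -> float:
    """Fixation probability of a mutation divided by that of a neutral one."""
    return fixation_probability(s, n_e) / fixation_probability(0.0, n_e)


if __name__ == "__main__":
    s = -1e-5  # hypothetical slightly deleterious mutation
    for n_e in (1e3, 1e4, 1e5, 1e6):
        print(f"Ne = {n_e:>9,.0f}  relative fixation probability = "
              f"{relative_to_neutral(s, n_e):.3g}")
```

With these assumed values the same mutation fixes almost as often as a neutral one when the population is small, and essentially never when the population is large. That is the intuition behind the claim that slightly deleterious mutations can accumulate in species with small populations.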

You can't understand genomes and junk DNA unless you have a firm understanding of evolution. In fact, you can't make sense of anything about genes and gene expression without such knowledge ... what the heck, nothing in all of biology makes sense if you don't know about evolution.

My approach hasn't been copied by popular websites. They usually misrepresent evolution by presenting it as adaptation; natural selection is the only game in town. I'll put in a link to Francis Collins describing evolution in a truly bizarre narration but my question for Sandwalk readers is whether this is useful or not. Is it better to dumb down evolution on the NIH: National Human Genome Website [Evolution] or is this a bad idea?


Friday, April 09, 2021

Should we teach genomics and evolution to medical students?

Rama Singh,1 a biology professor at McMaster University in Hamilton (Ontario, Canada) has just published an interesting article on The Conversation website. It's titled Medical schools need to prepare doctors for revolutionary advances in genetics. You can read the full article yourself but let me highlight the last few paragraphs to start the discussion.

Future physicians will be part of health networks involving medical lab technicians, data analysts, disease specialists and the patients and their family members. The physician would need to be knowledgeable about the basic principles of genetics, genomics and evolution to be able to take part in the chain of communication, information sharing and decision-making process.

This would require a more in-depth knowledge of genomics than generally provided in basic genetics courses.

Much has changed in genetics since the discovery of DNA, but much less has changed how genetics and evolution are taught in medical schools.

In 2013-14 a survey of course curriculums in American and Canadian medical schools showed that while most medical schools taught genetics, most respondents felt the amount of time spent was insufficient preparation for clinical practice as it did not provide them with sufficient knowledge base. The survey showed that only 15 per cent of schools covered evolutionary genetics in their programs.

A simple viable solution may require that all medical applicants entering medical schools have completed rigorous courses in genetics and genomics.

Here's the problem. I've just finished research on a book about modern evolution and genomics so I think I know a little bit about the subject. I'm also on the editorial board of a journal that publishes research on biochemistry and molecular biology education. I've written a biochemistry textbook and I have far too many years of experience trying to teach this material to graduate students and undergraduates at the University of Toronto. I can safely say that we (university teachers) have done a horrible job of teaching evolution and genomics to our students. We have turned out an entire generation of students who don't understand modern molecular evolution and don't understand what's in your genome.

What this means is that there's an extremely small pool of students who have completed "rigorous courses in genetics and genomics." Nobody will be able to apply to medical school. I doubt that we could teach this material to medical students with or without the appropriate background.

But you don't have to take my word for it. Some people have tried to teach this material to health science workers so we can see how it's working at that level. Take a look at the Genomics Education Programme supported by the NHS in the United Kingdom. They have a series of short videos and longer lessons that are designed to educate health care specialists. Here's the blurb that defines their objective.

Rapid advances in technology and understanding mean that genomics is now more relevant than ever before. As genomics increasingly becomes a part of mainstream NHS care, all healthcare professionals, and not just genomics specialists, need to have a good understanding of its relevance and potential to impact the diagnosis, treatment and management of people in our care.

In 2014, Health Education England (HEE) launched a four-year £20 million Genomics Education Programme (GEP) to ensure that our 1.2 million-strong NHS workforce has the knowledge, skills and experience to keep the UK at the heart of the genomics revolution in healthcare.

Funding for the programme has since been extended to enable us to continue our work in providing co-ordinated national direction of education and training in genomics and developing resources for a wide range of professionals.

They describe genes as 'coding' genes that build proteins. There's no mention of noncoding genes. They define a genome as "both genes (coding) and non-coding DNA." They also say that your genome is all of the DNA in your cells (46 chromosomes, 23 pairs). I don't see anything in their education packages that covers modern molecular evolution. In one of the packages they say,

The term ‘junk DNA’ has been used since the 1970s to describe non-coding regions of the genome, but today it is considered inaccurate and misleading. The term ‘junk’ suggests that 98% of the genome has no use, but in recent years, studies and projects have used advances in technology to shed light on these regions and have come to different conclusions about how much of the genome has a biological function.

Here's a link to a short video called What is a genome?. I recommend that you watch it to see the level that these experts think is suitable for health care professionals in the UK and to see the level of expertise of those who made the video. This is what seven years of work by experts and £20 million will get you.

All of this tells me that teaching genomics and evolution to medical students is going to be a lot more difficult than Rama Singh imagines. Not only would we have to counter several years of misinformation but we would have to rely on teachers who probably don't understand either topic.

Let's start by teaching these things correctly to biology and biochemistry majors. That's going to be hard enough for now.


1. Full disclosure: Rama and I shared an NSERC grant in 1981 on genetic variation in Drosophila.

SARS-CoV-2 mRNA vaccines: RNA + lipid nanoparticles

The new mRNA vaccines are the result of extensive research over the past thirty years or so. They are marvels of technological innovation but probably not just for the reasons you imagine. The basics of therapeutic mRNA synthesis have been around for about ten years but the problem was how to get the RNA into cells. That requires specialized lipid nanoparticles and making those has been the most recent technological advance. A lot of this research was done in Canada. I found a nice paper (Buschmann et al., 2021) that covers this research and I'll summarize the important points for those of you who don't have time to read it.

The mRNA

Normal messenger RNA is susceptible to nucleases and is not readily taken up by human cells. In addition, it elicits an innate immune response that results in suppression of translation through phosphorylation of eIF2a. The immune response can be blocked by incorporating modified nucleotides that are not recognized by the various receptors that stimulate the normal response. This was discovered over ten years ago. These modified nucleotides, such as N1-methylpseudouridine, were used to make the SARS-CoV-2 vaccine.

Thursday, April 08, 2021

On the accuracy of genomics in detecting disease variants

Several diseases, such as cancers, are caused by the presence of deleterious alleles that affect the function of a gene. In the case of cancer, most of the mutations are somatic cell mutations—mutations that have occurred after fertilization. These mutations will not be passed on to future generations. However, there are some variants that are present in the germline and these will be inherited. A small percentage of these variants will cause cancer directly but most will just indicate a predisposition to develop cancer.

There are a host of other diseases that have a genetic component and the responsible alleles can also be present in the germline or due to somatic cell mutations.

Over the past fifty years or so there has been a lot of hype associated with the latest technological advances and the ability to detect deleterious germline mutations. The general public has been repeatedly told that we will soon be able to identify all disease-causing alleles and this will definitely lead to incredible medical advances in treating these diseases. Just yesterday, for example, I posted an article on predictions made by the National Human Genome Research Institute (USA), which predicts that by 2030,

The clinical relevance of all encountered genomic variants will be readily predictable, rendering the diagnostic designation ‘variant of uncertain significance (VUS)’ obsolete.

Similar predictions, in various forms, were made when the human genome project got under way and at various times afterward. First there was the 1000 genomes project then there was the 100,000 genome project and, of course, ENCODE. The problem is that genomics hasn't lived up to these expectations and there's a very good reason for that: it's because the problem is a lot more difficult than it seems.

One of the Facebook groups that I follow (Modern Genetics & Technology)1 alerted me to a recent paper in JAMA that addressed the problem of genomics accuracy and the prediction of pathogenic variants. I'm posting the complete abstract so you can see the extent of the problem.

AlDubayan, S.H., Conway, J.R., Camp, S.Y., Witkowski, L., Kofman, E., Reardon, B., Han, S., Moore, N., Elmarakeby, H. and Salari, K. (2020) Detection of Pathogenic Variants With Germline Genetic Testing Using Deep Learning vs Standard Methods in Patients With Prostate Cancer and Melanoma. JAMA 324:1957-1969. [doi: 10.1001/jama.2020.20457]

Importance Less than 10% of patients with cancer have detectable pathogenic germline alterations, which may be partially due to incomplete pathogenic variant detection.

Objective To evaluate if deep learning approaches identify more germline pathogenic variants in patients with cancer.

Design, Setting, and Participants A cross-sectional study of a standard germline detection method and a deep learning method in 2 convenience cohorts with prostate cancer and melanoma enrolled in the US and Europe between 2010 and 2017. The final date of clinical data collection was December 2017.

Exposures Germline variant detection using standard or deep learning methods.

Main Outcomes and Measures The primary outcomes included pathogenic variant detection performance in 118 cancer-predisposition genes estimated as sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The secondary outcomes were pathogenic variant detection performance in 59 genes deemed actionable by the American College of Medical Genetics and Genomics (ACMG) and 5197 clinically relevant mendelian genes. True sensitivity and true specificity could not be calculated due to lack of a criterion reference standard, but were estimated as the proportion of true-positive variants and true-negative variants, respectively, identified by each method in a reference variant set that consisted of all variants judged to be valid from either approach.

Results The prostate cancer cohort included 1072 men (mean [SD] age at diagnosis, 63.7 [7.9] years; 857 [79.9%] with European ancestry) and the melanoma cohort included 1295 patients (mean [SD] age at diagnosis, 59.8 [15.6] years; 488 [37.7%] women; 1060 [81.9%] with European ancestry). The deep learning method identified more patients with pathogenic variants in cancer-predisposition genes than the standard method (prostate cancer: 198 vs 182; melanoma: 93 vs 74); sensitivity (prostate cancer: 94.7% vs 87.1% [difference, 7.6%; 95% CI, 2.2% to 13.1%]; melanoma: 74.4% vs 59.2% [difference, 15.2%; 95% CI, 3.7% to 26.7%]), specificity (prostate cancer: 64.0% vs 36.0% [difference, 28.0%; 95% CI, 1.4% to 54.6%]; melanoma: 63.4% vs 36.6% [difference, 26.8%; 95% CI, 17.6% to 35.9%]), PPV (prostate cancer: 95.7% vs 91.9% [difference, 3.8%; 95% CI, –1.0% to 8.4%]; melanoma: 54.4% vs 35.4% [difference, 19.0%; 95% CI, 9.1% to 28.9%]), and NPV (prostate cancer: 59.3% vs 25.0% [difference, 34.3%; 95% CI, 10.9% to 57.6%]; melanoma: 80.8% vs 60.5% [difference, 20.3%; 95% CI, 10.0% to 30.7%]). For the ACMG genes, the sensitivity of the 2 methods was not significantly different in the prostate cancer cohort (94.9% vs 90.6% [difference, 4.3%; 95% CI, –2.3% to 10.9%]), but the deep learning method had a higher sensitivity in the melanoma cohort (71.6% vs 53.7% [difference, 17.9%; 95% CI, 1.82% to 34.0%]). The deep learning method had higher sensitivity in the mendelian genes (prostate cancer: 99.7% vs 95.1% [difference, 4.6%; 95% CI, 3.0% to 6.3%]; melanoma: 91.7% vs 86.2% [difference, 5.5%; 95% CI, 2.2% to 8.8%]).

Conclusions and Relevance Among a convenience sample of 2 independent cohorts of patients with prostate cancer and melanoma, germline genetic testing using deep learning, compared with the current standard genetic testing method, was associated with higher sensitivity and specificity for detection of pathogenic variants. Further research is needed to understand the relevance of these findings with regard to clinical outcomes.

It's really difficult to understand this paper since there are many terms that I'd have to research more thoroughly; for example, does "germline whole-exon sequencing" mean that only sperm or egg DNA was sequenced and that every single exon in the entire genome was sequenced? Were exons in noncoding genes also sequenced?

I found it much more useful to look at the accompanying editorial by Gregory Feero.

Feero, W.G. (2020) Bioinformatics, Sequencing Accuracy, and the Credibility of Clinical Genomics. JAMA 324:1945-1947. [doi: 10.1001/jama.2020.19939]

Feero explains that the main problem is distinguishing real pathogenic variants from false positives and this can only be accomplished by first sequencing and assembling the DNA and then using various algorithms to focus on important variants. Then there's the third step.

The third step, which often requires a high level of clinical expertise, sifts through detected potentially deleterious variations to determine if any are relevant to the indication for testing. For example, exome sequencing ordered for a patient with unexplained cardiomyopathy might harbor deleterious variants in the BRCA1 gene which, while a potentially important incidental finding, does not provide a plausible molecular diagnosis for the cardiomyopathy. The complexity of the bioinformatics tools used in these 3 steps is considerable.

It's that third step that's analyzed in the AlDubayan et al. paper and one of the tools used is a deep-learning (AI) algorithm. However, the training of this algorithm requires considerable clinical expertise and testing it requires a gold standard set of variants to serve as an internal control. As you might have guessed, that gold standard doesn't exist because the whole point of genomics is to identify previously unknown deleterious alleles.
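To make the four measures in the abstract concrete, and to show what the missing gold standard implies, here is a minimal sketch. It is not the authors' pipeline. The counts, variant identifiers, and function names are hypothetical, and the second function only mimics the idea described in the abstract of scoring each method against the pooled set of variants judged valid from either approach.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard confusion-matrix measures for a binary test."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all real positives
        "specificity": tn / (tn + fp),  # true negatives among all real negatives
        "ppv": tp / (tp + fp),          # fraction of positive calls that are real
        "npv": tn / (tn + fn),          # fraction of negative calls that are real
    }


def sensitivity_vs_pooled_reference(calls_by_method: dict, validated: set) -> dict:
    """Estimate sensitivity when no independent gold standard exists.

    The reference set is every validated variant called by at least one
    method. A real variant missed by all methods never enters the reference,
    so these estimates are upper bounds on the true sensitivity.
    """
    pooled = set().union(*calls_by_method.values()) & validated
    return {
        method: len(calls & pooled) / len(pooled)
        for method, calls in calls_by_method.items()
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(diagnostic_metrics(tp=8, fp=2, tn=6, fn=4))

    # Hypothetical variant calls; "v8" and "v9" failed validation.
    calls = {
        "standard": {"v1", "v2", "v9"},
        "deep_learning": {"v1", "v2", "v3", "v4", "v8"},
    }
    print(sensitivity_vs_pooled_reference(calls, validated={"v1", "v2", "v3", "v4"}))
```

The design point is in the second function: because the reference is built from the methods' own calls, a method can look very sensitive while pathogenic variants that neither method detected remain invisible to the comparison.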

Feero warns us that "clinical genome sequencing remains largely unregulated and accuracy is highly dependent on the expertise of individual testing laboratories." He concludes that genomics still has a long way to go.

The genomics community needs to act as a coherent body to ensure reproducibility of outcomes from clinical genome or exome sequencing, or provide transparent quality metrics for individual clinical laboratories. Issues related to achieving accuracy are not new, are not limited to bioinformatics tools, and will not be surmounted easily. However, until analytic and clinical validity are ensured, conversations about the potential value that genome sequencing brings to clinical situations will be challenging for clinical centers, laboratories that provide sequencing services, and consumers. For the foreseeable future, nongeneticist clinicians should be familiar with the quality of their chosen genome-sequencing laboratory and engage expert advice before changing patient management based on a test result.

I'm guessing that Gregory Feero doesn't think that in nine years (2030) "The clinical relevance of all encountered genomic variants will be readily predictable."


1. I do NOT recommend this group. It's full of amateurs who resist learning and one of its main purposes is to post copies of pirated textbooks in its files. The group members get very angry when you tell them that what they are doing is illegal!

Wednesday, April 07, 2021

Bold predictions for human genomics by 2030

After spending several years working on a book about the human genome I've come to the realization that the field of genomics is not delivering on its promise to help us understand what's in your genome. In fact, genomics researchers have by and large impeded progress by coming up with false claims that need to be debunked.

My view is not widely shared by today's researchers who honestly believe they have made tremendous progress and will make even more as long as they get several billion dollars to continue funding their research. This view is nicely summarized in a Scientific American article from last fall that's really just a precis of an article that first appeared in Nature. The Nature article was written by employees of the National Human Genome Research Institute (NHGRI) at the National Institutes of Health in Bethesda, MD, USA (Green et al., 2020). Its purpose is to promote the work that NHGRI has done in the past and to summarize its strategic vision for the future. At the risk of oversimplifying, the strategic vision is "more of the same."

Green, E.D., Gunter, C., Biesecker, L.G., Di Francesco, V., Easter, C.L., Feingold, E.A., Felsenfeld, A.L., Kaufman, D.J., Ostrander, E.A. and Pavan, W.J. and 20 others (2020) Strategic vision for improving human health at The Forefront of Genomics. Nature 586:683-692. [doi: 10.1038/s41586-020-2817-4]

Starting with the launch of the Human Genome Project three decades ago, and continuing after its completion in 2003, genomics has progressively come to have a central and catalytic role in basic and translational research. In addition, studies increasingly demonstrate how genomic information can be effectively used in clinical care. In the future, the anticipated advances in technology development, biological insights, and clinical applications (among others) will lead to more widespread integration of genomics into almost all areas of biomedical research, the adoption of genomics into mainstream medical and public-health practices, and an increasing relevance of genomics for everyday life. On behalf of the research community, the National Human Genome Research Institute recently completed a multi-year process of strategic engagement to identify future research priorities and opportunities in human genomics, with an emphasis on health applications. Here we describe the highest-priority elements envisioned for the cutting-edge of human genomics going forward—that is, at ‘The Forefront of Genomics’.

What's interesting are the predictions that the NHGRI makes for 2030—predictions that were highlighted in the Scientific American article. I'm going to post those predictions without comment other than saying that I think they are mostly bovine manure. I'm interested in hearing your comments.

Bold predictions for human genomics by 2030

Some of the most impressive genomics achievements, when viewed in retrospect, could hardly have been imagined ten years earlier. Here are ten bold predictions for human genomics that might come true by 2030. Although most are unlikely to be fully attained, achieving one or more of these would require individuals to strive for something that currently seems out of reach. These predictions were crafted to be both inspirational and aspirational in nature, provoking discussions about what might be possible at The Forefront of Genomics in the coming decade.

  1. Generating and analysing a complete human genome sequence will be routine for any research laboratory, becoming as straightforward as carrying out a DNA purification.
  2. The biological function(s) of every human gene will be known; for non-coding elements in the human genome, such knowledge will be the rule rather than the exception.
  3. The general features of the epigenetic landscape and transcriptional output will be routinely incorporated into predictive models of the effect of genotype on phenotype.
  4. Research in human genomics will have moved beyond population descriptors based on historic social constructs such as race.
  5. Studies that involve analyses of genome sequences and associated phenotypic information for millions of human participants will be regularly featured at school science fairs.
  6. The regular use of genomic information will have transitioned from boutique to mainstream in all clinical settings, making genomic testing as routine as complete blood counts.
  7. The clinical relevance of all encountered genomic variants will be readily predictable, rendering the diagnostic designation ‘variant of uncertain significance (VUS)’ obsolete.
  8. An individual’s complete genome sequence along with informative annotations will, if desired, be securely and readily accessible on their smartphone.
  9. Individuals from ancestrally diverse backgrounds will benefit equitably from advances in human genomics.
  10. Breakthrough discoveries will lead to curative therapies involving genomic modifications for dozens of genetic diseases.

I predict that nine years from now (2030) we will still be dealing with scientists who think that most of our genome is functional; that most human protein-coding genes produce many different proteins by alternative splicing; that epigenetics is useful; that there are more noncoding genes than protein-coding genes; that the leading scientists in the 1960s and 70s were incredibly stupid to suggest junk DNA; that almost every transcription factor binding site is biologically relevant; that most transposon-related sequences have a mysterious (still unknown) function; that it's still a mystery why humans are so much more complex than chimps; and that genomics will eventually solve all problems by 2040.

Why in the world, you might ask, would we still be dealing with issues like that? Because of genomics.


Saturday, April 03, 2021

"Dark matter" as an argument against junk DNA

Opponents of junk DNA have been largely unsuccessful in demonstrating that most of our genome is functional. Many of them are vaguely aware of the fact that "no function" (i.e. junk) is the default hypothesis and the onus is on them to come up with evidence of function. In order to shift, or obfuscate, this burden of proof they have increasingly begun to talk about the "dark matter" of the genome. The idea is to pretend that most of the genome is a complete mystery so that you can't say for certain whether it is junk or functional.

One of the more recent attempts appears in the "Journal Club" section of Nature Reviews Genetics. It focuses on repetitive DNA.

Before looking at that article, let's begin by summarizing what we already know about repetitive DNA. It includes highly repetitive DNA consisting of multiple tandem repeats of short sequences such as ATATATATAT... or CGACGACGACGA ... or even longer repeats. Much of this is located in centromeric regions of the chromosome and I estimate that functional highly repetitive regions make up about 1% of the genome [see Centromere DNA and Telomeres].

The other part of repetitive DNA is middle repetitive DNA, which is largely composed of transposons and endogenous viruses, although it includes ribosomal RNA genes and origins of replication. Most of these sequences are dispersed as single copies throughout the genome. It's difficult to determine exactly how much of the genome consists of these middle repetitive sequences but it's certainly more than 50%.

Almost all of the transposon- and virus-related sequences are defective copies of once active transposons and viruses. Most of them are just fragments of the originals. They are evolving at the neutral rate so they look like junk and they behave like junk.1 That's not selfish DNA because it doesn't transpose and it's not "dark matter." These fragments have all the characteristics of nonfunctional junk in our genome.

We know that the C-value paradox is mostly explained by differing amounts of repetitive DNA in different genomes and this is consistent with the idea that they are junk. We know that less than 10% of our genome is conserved and this fits in with that conclusion. Finally, we know that genetic load arguments indicate that most of our genome must be impervious to mutation. Combined, these are all powerful bits of evidence and logic in favor of repetitive sequences being mostly junk DNA.

Now let's look at what Neil Gemmell says in this article.

Gemmell, N.J. (2021) Repetitive DNA: genomic dark matter matters. Nature Reviews Genetics:1-1. [doi: 10.1038/s41576-021-00354-8]

"Repetitive DNA sequences were found in hundreds of thousands, and sometimes millions, of copies in the genomes of most eukaryotes. While widespread and evolutionarily conserved, the function of these repeats was unknown. Provocatively, Britten and Kohne concluded 'a concept that is repugnant to us is that about half of the DNA of higher organisms is trivial or permanently inert.'"

That's from Britten and Kohne (1968) and it's true that more than 50 years ago those workers didn't like the idea of junk DNA. Britten argued that most of this repetitive DNA was likely to be involved in regulation. Gemmell goes on to describe centromeres and telomeres and mentions that most repetitive DNA was thought to be junk.

"... the idea that much of the genome is junk, maintained and perpetuated by random chance, seemed as broadly unsatisfactory to me as it had to the original authors. Enthralled by the mystery of why half our genome is repetitive DNA, I have followed this field ever since."

Gemmell is not alone. In spite of all the evidence for junk DNA, the majority of scientists don't like the fact that most of our genome is junk. Here's how he justifies his continued skepticism.

"But it was not until the 2000s, as full eukaryotic genome sequences emerged, that we discovered that the repetitive non-coding regions of our genome harbour large numbers of promoters, enhancers, transcription factor binding sites and regulatory RNAs that control gene expression. More recently, the importance of repetitive DNA in both structural and regulatory processes has emerged, but much remains to be discovered and understood. It is time to shine further light on this genomic dark matter."

This appears to be the ENCODE publicity campaign legacy rearing its ugly head once more. Most Sandwalk readers know that the presence of transcription factor binding sites, RNA polymerase binding sites, and junk RNA is exactly what one would predict from a genome full of defective transposons. Most of us know that a big fat sloppy genome is bound to contain millions of spurious binding sites for transcription factors so this says nothing about function.
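The arithmetic behind spurious binding sites can be sketched in a few lines. This is a back-of-the-envelope illustration under loudly simplified assumptions: a genome of roughly 3.2 billion base pairs, equal frequencies of the four bases, independent positions, and exact matches to a single fixed motif. The function name and the motif lengths are hypothetical choices for the example.

```python
def expected_random_matches(genome_length: int, motif_length: int,
                            both_strands: bool = True) -> float:
    """Expected exact matches to one fixed motif in random DNA.

    Assumes equal base frequencies and independence between positions.
    Counting both strands doubles the total, which overcounts palindromic
    motifs.
    """
    positions = genome_length - motif_length + 1
    per_strand = positions / 4 ** motif_length
    return 2 * per_strand if both_strands else per_strand


if __name__ == "__main__":
    genome_length = 3_200_000_000  # assumed approximate human genome size in bp
    for motif_length in (6, 8, 10, 12):  # hypothetical binding-site lengths
        count = expected_random_matches(genome_length, motif_length)
        print(f"{motif_length:>2} bp motif: about {count:,.0f} chance matches")
```

Real binding sites are short and tolerate mismatches, which makes chance matches more common than this exact-match estimate. Summed over hundreds of transcription factors, that is enough to expect a very large number of sites that bind a factor without doing anything.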

Apparently Gemmell's skepticism doesn't apply to the ENCODE results so he still thinks that all those bits and pieces of transposons are mysterious bits of dark matter that could be several billion base pairs of functional DNA. I don't know what he imagines they could be doing.


Photo Credit: The photo shows human chromosomes labelled with a telomere probe (yellow), from Christopher Counter at Duke University.

1. In my book, I cover this in a section called "If it walks like a duck ..." It's a form of abductive reasoning.

Britten, R. and Kohne, D. (1968) Repeated Sequences in DNA. Science 161:529-540. [doi: 10.1126/science.161.3841.529]