Wednesday, June 09, 2021

Let's analyze the Newsweek lab leak conspiracy theory article

Lots of people have been sucked into the lab leak conspiracy theory based on reporting in newspapers and magazines. One of the widely cited sources is an article published in Newsweek on June 2, 2021, titled "How Amateur Sleuths Broke the Wuhan Lab Story and Embarrassed the Media." Those "amateur sleuths" go by the name "Decentralized Radical Autonomous Search Team Investigating COVID-19" or DRASTIC. I'm not interested in them; I'm interested in scientific facts, so let's look at all of the so-called "facts" in the Newsweek article. I'll leave it up to you, dear reader, to judge whether the media should be embarrassed by this story.

Newsweek statement #1: Thanks to DRASTIC, we now know that the Wuhan Institute of Virology had an extensive collection of coronaviruses gathered over many years of foraging in the bat caves, and that many of them—including the closest known relative to the pandemic virus, SARS-CoV-2—came from a mineshaft where three men died from a suspected SARS-like disease in 2012.

Some of this is correct. The WIV scientists and their collaborators have been collecting samples from bats all over China and Indochina for several years and many of them have been examined for the presence of coronaviruses. WIV scientists routinely sampled bats from the Yunnan mine cave from 2012 to 2015 after they were informed that four people had been admitted to hospital with severe respiratory disease in 2012 (one of them died). The workers tested negative for Ebola, Nipah virus, and coronavirus so the scientists were looking for a likely unknown virus that caused the infection. (The serum samples were subsequently tested for SARS-CoV-2 and they were negative.)

Several coronaviruses were detected in the bat samples based on short PCR sequences (370 bp) from the RdRp gene and they were classified as either alphacoronaviruses or betacoronaviruses. The data was published in 2016 (Ge et al., 2016) and the sequences were deposited in GenBank in 2016. Improvements in sequencing technology in 2018 prompted a re-examination of those bat samples and an almost full-length sequence of a betacoronavirus was obtained (missing the 5′ and 3′ ends). This virus was named RaTG13 and one of the short GenBank sequences identified as BtCoV/4491 (Accession #KP876546) comes from that virus (Zhou et al., 2020 Addendum).

The bat virus is RaTG13 and it is 96% similar in sequence to SARS-CoV-2, which means that they probably shared a common ancestor about 50 years ago (Zhou et al., 2020). The sequence was deposited in GenBank as Accession #MN996532. There are parts of SARS-CoV-2 that are not closely related to RaTG13 and this includes the spike protein gene, which is essential for infecting humans. The spike gene sequence is most closely related to a coronavirus from pangolins, Pangolin-CoV.
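As a rough illustration of how a figure like "96% similar" relates to "about 50 years," the sketch below applies a naive molecular clock to the two numbers quoted above. It assumes that differences accumulate at a constant rate on both lineages and it ignores recombination and repeated changes at the same site, so it is only a back-of-the-envelope check and not the dating method used in the cited papers. The function names are made up for this example.

```python
def pairwise_distance(identity: float) -> float:
    """Fraction of sites that differ between two aligned sequences.

    This is the naive p-distance: no correction for multiple changes
    at the same site.
    """
    if not 0.0 <= identity <= 1.0:
        raise ValueError("identity must be a fraction between 0 and 1")
    return 1.0 - identity


def implied_rate(identity: float, years_since_ancestor: float) -> float:
    """Substitutions per site per year implied by an identity and a split date.

    After the split, BOTH lineages accumulate changes independently,
    so the total elapsed branch length is 2 * years_since_ancestor.
    """
    return pairwise_distance(identity) / (2.0 * years_since_ancestor)


def divergence_years(identity: float, rate: float) -> float:
    """Years since the common ancestor, given an assumed constant rate."""
    return pairwise_distance(identity) / (2.0 * rate)


if __name__ == "__main__":
    # Numbers taken from the text: 96% identity, ancestor ~50 years ago.
    rate = implied_rate(0.96, 50.0)
    print(f"implied rate: {rate:.1e} substitutions per site per year")
    # Running the calculation the other way recovers the ~50 year figure.
    print(f"years since ancestor: {divergence_years(0.96, rate):.0f}")
```

Under these simplifying assumptions, the two figures in the text correspond to a rate of a few changes per ten thousand sites per year on each lineage. Because different parts of the genome have different histories due to recombination, a single genome-wide number like this should be treated as illustrative only.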

The data is consistent with a recombination event between different strains of coronaviruses giving rise to SARS-CoV-2 or its immediate ancestors. Such recombinations are a common feature of coronavirus propagation in various animals, including bats. What's clear is that none of the currently known coronavirus sequences could possibly be the ancestors of SARS-CoV-2 so the hunt is on to locate those viruses.

Recently, the scientists at WIV and their collaborators at the University of Chinese Academy of Sciences in Beijing looked at some of the other samples from bat anal swabs collected in Yunnan in 2015. This in-depth analysis was prompted by the discovery of SARS-CoV-2 and the pandemic. They found a number of other bat coronavirus sequences and some of them were more closely related to SARS-CoV-2 in the ORF1b regions but not in other parts of the genome. Again, this is consistent with frequent recombination events that have been documented over the past few decades. Surprisingly, some of these new bat coronaviruses were able to use the bat angiotensin-converting enzyme 2 (ACE2) as a receptor, but they did not bind to human ACE2. (These assays take a lot of time and effort.) This and other data show that the evolution of ACE2 binding can occur in bats, giving rise to a generalist virus, SARS-CoV-2, that can bind to ACE2 from many different species (MacLean et al., 2021; Guo et al., 2021).

A group of scientists from France, the United States, Vietnam, and Cambodia looked at bat samples that were collected in Cambodia in 2010 and found coronaviruses from another species of bats that were closely related to SARS-CoV-2 across most of the genome except for a small region of the spike protein gene. In some parts of the genome (ORF1a and ORF8) these viruses were more closely related to SARS-CoV-2 than RaTG13 (Hu et al. 2021). The evolutionary history of the Cambodian viruses indicates that they are mosaic viruses due to recombination events. This data indicates that SARS-CoV-2 related viruses are found in Southeast Asia as well as China. That's significant since pangolins are only found in Southeast Asia and not in China.

SARS-CoV-2-like viruses have also been found in Thailand (Wacharapluesadee et al., 2021).

A group centered in Taian, China, has recently examined coronaviruses from bats at the botanical garden in Mengla County in Yunnan. They have identified four additional SARS-CoV-2 related viruses including one, RpYN06, that is the closest relative to SARS-CoV-2 outside of the spike gene. This is now the leading candidate for the "backbone" that might have given rise to the pandemic virus (Zhou et al., 2021).

CONCLUSION: The Newsweek statement is not wrong but it is highly misleading. The WIV labs had bat samples that contained coronaviruses but so did lots of labs all over the world. In that sense, these labs have an "extensive collection of coronaviruses" but they are stored in bat poop at -80°C! They identified two coronaviruses, RaTG13 and RmYN02, by sequencing PCR fragments but the sequences were not complete. It's misleading for Newsweek to imply that the WIV labs had an RaTG13 coronavirus in their labs because that implies that they were working with active viruses. It's true that the RaTG13 virus came from a place where several workers had gotten sick with respiratory disease a few years before the sample was collected. One of these men died (not three) but none of the patients tested positive for coronavirus.

Newsweek statement #2: We know that the WIV was actively working with these viruses, using inadequate safety protocols, in ways that could have triggered the pandemic, and that the lab and Chinese authorities have gone to great lengths to conceal these activities.

CONCLUSION: This is misleading. As far as I know, the scientists at WIV were not actively working with the RaTG13 virus because they had never isolated that virus. Furthermore, it's almost impossible to create SARS-CoV-2 from RaTG13 [Could scientists use the bat coronavirus RaTG13 to engineer SARS-CoV-2, the virus that causes COVID-19, in a lab?]. They were working with other bat coronaviruses but none of them were closely related to SARS-CoV-2 so it's extremely misleading to imply that the escape of these viruses could have triggered the pandemic. They were not using inadequate safety protocols because all of the work with bat coronaviruses was carried out in level 2 labs, exactly as required. There's no evidence that the scientists at the WIV labs have concealed anything. You can only accuse someone of concealing something if you have strong evidence that they did something that they deny doing.

Newsweek statement #3: We know that the first cases appeared weeks before the outbreak at the Huanan wet market that was once thought to be ground zero.

CONCLUSION: This is correct. Chinese scientists and health workers identified a number of earlier cases that appear to be unrelated to the seafood market and they published their results in scientific journals over a year ago. They now conclude that the virus was circulating in the Wuhan population for more than a month before the superspreader event at the market ignited the pandemic. This appears to be a case where Newsweek trusts the work of Chinese scientists.

Newsweek statement #4: The Newsweek article talks a lot about the DRASTIC group as though they have uncovered a huge conspiracy. One of their "discoveries" relates to the bat coronavirus RaTG13 that's first mentioned in the paper where the SARS-CoV-2 sequence was published. Here's what Newsweek wrote: "The paper was vague about where RaTG13 had come from. It didn't say exactly where or when RaTG13 had been found, just that it had previously been detected in a bat in Yunnan Province, in southern China.

The paper aroused Deigin's suspicions. He wondered if SARS-CoV-2 might have emerged through some genetic mixing and matching from a lab working with RaTG13 or related viruses. His post was cogent and comprehensive. The Seeker posted Deigin's theory on Reddit, which promptly suspended his account permanently."

CONCLUSION: This is written like it's a big mystery that was uncovered by some clever sleuthing. It's true that the origin of RaTG13 was not discussed in the SARS-CoV-2 paper in January 2020 other than to say that it was found in a bat in Yunnan. I assume that the authors didn't think it was important (and still don't). The origin was explained in November 2020 in an Addendum to the Nature article (Zhou et al., 2020, Addendum). It was one of the viruses discovered in the bats from the Yunnan mine cave and a partial sequence had been published earlier (Ge et al., 2016). It's not particularly close to SARS-CoV-2 and there's no reason to speculate that it was artificially created unless you are trying to create a conspiracy.

Newsweek statement #5: The key facts quickly came together. The genetic sequence for RaTG13 perfectly matched a small piece of genetic code posted as part of a paper written by Shi Zhengli years earlier, but never mentioned again. The code came from a virus the WIV had found in a Yunnan bat. Connecting key details in the two papers with old news stories, the DRASTIC team determined that RaTG13 had come from a mineshaft in Mojiang County, in Yunnan Province, where six men shoveling bat guano in 2012 had developed pneumonia. Three of them died. DRASTIC wondered if that event marked the first cases of human beings being infected with a precursor of SARS-CoV-2—perhaps RaTG13 or something like it.

In a profile in Scientific American, Shi Zhengli acknowledged working in a mineshaft in Mojiang County where miners had died. But she avoided connecting it to RaTG13 (an omission she had made in her scientific papers as well), claiming that a fungus in the cave had killed the miners.

This reads just like a typical conspiracy theory where "clever" sleuths (i.e. internet amateurs) uncover information that was hidden or covered up by those they are accusing. The origin of RaTG13 was explained in an addendum to the publication of the SARS-CoV-2 sequence in February 2020. The addendum was added in November 2020 in response to questions about the origin of RaTG13 but that information was widely known. The sequence of a short fragment of this virus was obtained earlier as explained above.

The WIV scientists were very concerned about the Yunnan mine workers because they had symptoms that were similar to those of SARS patients and that's why they tested serum from the patients. They were negative for all the viruses, including the original SARS-CoV-1. (The serum is also negative for SARS-CoV-2.) The WIV scientists were worried that the infections were due to an unknown virus that could cause a pandemic so they went back to the mine every year to collect samples from the bats. The RaTG13 sequence came from one of those samples but by then the scientists knew that there was no connection between the bat coronaviruses and the sick mine workers. (They were probably disappointed at the lack of connection because they were looking for the cause of the 2002 SARS outbreak.)

The WIV scientists now believe that the Yunnan mine workers had contracted a fungal infection from the fungus growing on the bat guano. There is no reason to connect RaTG13 to the mine workers because it's been known for many years that the workers were not infected with any coronavirus.

The RaTG13 virus is from the bat species Rhinolophus affinis (hence the designation "Ra") but up until the beginning of the pandemic the WIV scientists were much more interested in another cave in Yunnan populated by a number of different species. They reported that this cave represents the most diverse collection of bat coronaviruses in the world. Most of the ones that are SARS-like were from a different species of bat, Rhinolophus sinicus and many of these bound the same ACE2 receptor that SARS-CoV-1 used—the same one used by the more recent SARS-CoV-2 (Hu et al. 2017; Cui et al., 2019).

CONCLUSION: The Newsweek article is repeating innuendos and conspiracies that have been discredited in the past. The DRASTIC team is deliberately making up connections between coronavirus and the mine workers but all of the data shows that there's no direct connection. It just happened that one of the bat coronaviruses collected in that mine was the one closest to SARS-CoV-2, in part because that was a pretty extensive collection. The RaTG13 sequence is not similar enough to SARS-CoV-2 to be the direct ancestor and, besides, there are now known to be other virus sequences from as far away as Cambodia that are just as similar to SARS-CoV-2.

Newsweek statement #6: That explanation didn't sit well with the DRASTIC group. They suspected a SARS-like virus, not a fungus, had killed the miners and that, for whatever reason, the WIV was trying to hide that fact. It was a hunch, and they had no way of proving it.

At this point, The Seeker revealed his research powers to the group. In his online explorations, he'd recently discovered a massive Chinese database of academic journals and theses called CNKI. Now he wondered if somewhere in its vast circuitry might be information on the sickened miners.

Working through the night at his bedside table on phone and laptop, fueled by chai and using Chinese characters with the help of Google Translate, he plugged in "Mojiang"—the county where the mine was located—in combination with every other word he could think of that might be relevant, instantly translating each new flush of results back to English. "Mojiang + pneumonia"; "Mojiang + WIV"; "Mojiang + bats"; "Mojiang + SARS." Each search brought back thousands of results and half a dozen different databases for journals, books, newspapers, master's theses, doctoral dissertations. He combed through these results, night after night, but never found anything useful. When he ran out of energy, he broke for arcade games and more chai.

He was on the verge of calling it quits, he says, when he struck gold: a 60-page master's thesis written by a student at Kunming Medical University in 2013 titled "The Analysis of 6 Patients with Severe Pneumonia Caused by Unknown Viruses." In exhaustive detail, it described the conditions and step-by-step treatment of the miners. It named the suspected culprit: "Caused by SARS-like [coronavirus] from the Chinese horseshoe bat or other bats."

CONCLUSION: Move along folks; there's nothing to see here. The WIV scientists suspected that the miners were infected with an unknown virus and that's why they were concerned in 2012. They knew that coronavirus wasn't responsible and neither was any other known virus. This is why they went back every year to test the bats in the mine shaft. They know that the stored serum from these workers is negative for SARS-CoV-2, which is not a surprise. They now suspect that the mine workers had contracted a fungal infection and not a viral infection. It's not particularly surprising that a student reported the suspected cause of the symptoms back in the beginning of the investigation.

Newsweek statement #7: Ribera was responsible for solving another piece of the RaTG13 puzzle. Had the WIV been actively working on RaTG13 during the seven years since they discovered it? Peter Daszak said no: they had never used the virus because it wasn't similar enough to the original SARS. "We thought it's interesting, but not high-risk," he told Wired. "So we didn't do anything about it and put it in the freezer."

Ribera disproved that account. When a new science paper on genetics is published, the authors must upload the accompanying genetic sequences to an international database. By examining some metadata tags that had been accidentally uploaded by the WIV along with its genetic sequences for RaTG13, Ribera discovered that scientists at the lab had indeed been actively studying the virus in 2017 and 2018—they hadn't stuck it in a freezer and forgotten about it, after all.

I don't know what this means. The WIV scientists sequenced a bit of what turned out to be the RaTG13 virus when they categorized all the other viruses back in 2012-2015 (Ge et al., 2016). They then completed an almost whole genome sequence later on in 2018 when their sequencing techniques improved. It's important to keep in mind that the WIV never worked with the RaTG13 virus as emphasized by Frutos et al. (2021): "One must remember that SARS-CoV-2 was never found in the wild and that RaTG13 does not exist as a real virus but instead only as a sequence in a computer. It is a virtual virus which thus cannot leak from a laboratory." 1

CONCLUSION: The scientists at WIV were "working with" the RaTG13 PCR fragments in 2017 and 2018 as they assembled the whole genome sequence. They also assembled the sequences of several other viruses at the same time. To say that they were "actively studying" the virus is very misleading and to accuse Peter Daszak of lying is irresponsible.

Newsweek statement #8: In fact, the WIV had been intensely interested in RaTG13 and everything else that had come from the Mojiang mineshaft. From his giant Sudoku puzzle, Ribera determined that they made at least seven different trips to the mine, over many years, collecting thousands of samples. Ribera's guess is that their technology had not been good enough in 2012 and 2013 to find the virus that had killed the miners, so they kept going back as the techniques improved.

He also made a bold prediction. Cross-referencing snippets of information from multiple sources, Ribera guessed, in a Twitter thread dated August 1, 2020, that a cluster of eight SARS-related viruses mentioned briefly in an obscure section of one WIV paper had actually also come from the Mojiang mine. In other words, they hadn't found one relative of SARS-CoV-2 in that mineshaft; they'd found nine. In November 2020, Shi Zhengli confirmed many of DRASTIC's suspicions about the Mojiang cave in an addendum to her original paper on RaTG13 and in a talk in February 2021.

The mine shaft is located in Mojiang county, Yunnan—a map of the location was published in Ge et al. (2016). It contains six different bat species and many of them were infected with coronaviruses. The WIV scientists collected many samples over a number of years in order to determine the phylogeny of the viruses and which species were infected. They also did longitudinal studies to see if the different virus variants changed over time and to see if the infection rates of the various bat species were different from year to year. They also wanted to see if they could detect recombinations between different virus groups.

They obtained 152 partial sequences and then picked 12 of them for more detailed analysis in order to construct a phylogenetic tree from 816 bp of the RNA-dependent RNA polymerase (RdRp) gene. Anyone can read the Ge et al. (2016) paper to see why they were doing these experiments. There's nothing mysterious or unusual about their approach. It's the same one they took with the viruses from the other site (cave) in Yunnan where they identified the two bat coronaviruses that are most closely related to the original SARS virus (Ge et al., 2013) (see: SARS outbreak linked to Chinese bat cave).

CONCLUSION: The Newsweek article is making a huge mountain out of a molehill and it's misrepresenting the work of the "amateur sleuths." It's not a secret or a mystery that the WIV scientists were studying the coronaviruses from the mine shaft. That's what they do and they publish in journals that are easy to access.

Newsweek statement #9: "Other databases yielded other clues. In the WIV's grant applications and awards, The Seeker found detailed descriptions of the Institute's research plans, and they were damning: Projects were underway to test the infectivity of novel SARS-like viruses they'd discovered in human cells and in lab animals, to see how they might mutate as they crossed species, and to genetically recombine pieces of different viruses—all being done at woefully inadequate biosecurity levels. All the elements for a disaster were on hand."

CONCLUSION: It's true that the WIV scientists were looking at SARS-like coronaviruses and they were testing for infectivity in humanized mouse cells. The goal was to look for new coronaviruses that could bind ACE2 and they found quite a few of them. In many cases, they expressed the spike protein in recombinant viruses and plasmids just as you would expect them to do if they were looking for the source of the original SARS virus (SARS-CoV-1). All this is described in their grant applications and in their publications. Looks like they didn't make much of an attempt to hide this research. All the experiments were done under the appropriate biosafety measures as specified by international inspectors who visited the lab on several occasions. None of this has anything to do with the pandemic because they were not working with SARS-CoV-2 or any close relative.

The rest of the Newsweek article consists mostly of praise for the DRASTIC heroes and the excellent work they have done in uncovering a huge conspiracy to cover up the fact that the WIV scientists started a pandemic. However, one embarrassing fact remains: there is not a shred of evidence that the lab was working with SARS-CoV-2 before the pandemic started. In the absence of such evidence it is irresponsible to accuse these reputable scientists of lying.


1. One could quibble slightly about the accuracy of this statement since there might be RaTG13 virus particles in the bat fecal samples that are stored in the -80°C freezer.

Cui, J., Li, F. and Shi, Z.-L. (2019) Origin and evolution of pathogenic coronaviruses. Nature Reviews Microbiology 17:181-192. [doi: 10.1038/s41579-018-0118-9]

Severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) are two highly transmissible and pathogenic viruses that emerged in humans at the beginning of the 21st century. Both viruses likely originated in bats, and genetically diverse coronaviruses that are related to SARS-CoV and MERS-CoV were discovered in bats worldwide. In this Review, we summarize the current knowledge on the origin and evolution of these two pathogenic coronaviruses and discuss their receptor usage; we also highlight the diversity and potential of spillover of bat-borne coronaviruses, as evidenced by the recent spillover of swine acute diarrhoea syndrome coronavirus (SADS-CoV) to pigs.

Hu, V., Delaune, D., Karlsson, E.A., Hassanin, A., Tey, P.O., Baidaliuk, A., Gámbaro, F., Tu, V.T., Keatts, L. and Mazet, J. (2021) A novel SARS-CoV-2 related coronavirus in bats from Cambodia. bioRxiv. [doi: 10.1101/2021.01.26.428212]

Knowledge of the origin and reservoir of the coronavirus responsible for the ongoing COVID-19 pandemic is still fragmentary. To date, the closest relatives to SARS-CoV-2 have been detected in Rhinolophus bats sampled in the Yunnan province, China. Here we describe the identification of SARS-CoV-2 related coronaviruses in two Rhinolophus shameli bats sampled in Cambodia in 2010. Metagenomic sequencing identified nearly identical viruses sharing 92.6% nucleotide identity with SARS-CoV-2. Most genomic regions are closely related to SARS-CoV-2, with the exception of a small region corresponding to the spike N terminal domain. The discovery of these viruses in a bat species not found in China indicates that SARS-CoV-2 related viruses have a much wider geographic distribution than previously understood, and suggests that Southeast Asia represents a key area to consider in the ongoing search for the origins of SARS-CoV-2, and in future surveillance for coronaviruses.

Ge, X.-Y., Li, J.-L., Yang, X.-L., Chmura, A.A., Zhu, G., Epstein, J.H., Mazet, J.K., Hu, B., Zhang, W. and Peng, C. (2013) Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor. Nature 503:535-538. [doi: 10.1038/nature12711]

The 2002–3 pandemic caused by severe acute respiratory syndrome coronavirus (SARS-CoV) was one of the most significant public health events in recent history. An ongoing outbreak of Middle East respiratory syndrome coronavirus suggests that this group of viruses remains a key threat and that their distribution is wider than previously recognized. Although bats have been suggested to be the natural reservoirs of both viruses, attempts to isolate the progenitor virus of SARS-CoV from bats have been unsuccessful. Diverse SARS-like coronaviruses (SL-CoVs) have now been reported from bats in China, Europe and Africa, but none is considered a direct progenitor of SARS-CoV because of their phylogenetic disparity from this virus and the inability of their spike proteins to use the SARS-CoV cellular receptor molecule, the human angiotensin converting enzyme II (ACE2). Here we report whole-genome sequences of two novel bat coronaviruses from Chinese horseshoe bats (family: Rhinolophidae) in Yunnan, China: RsSHC014 and Rs3367. These viruses are far more closely related to SARS-CoV than any previously identified bat coronaviruses, particularly in the receptor binding domain of the spike protein. Most importantly, we report the first recorded isolation of a live SL-CoV (bat SL-CoV-WIV1) from bat faecal samples in Vero E6 cells, which has typical coronavirus morphology, 99.9% sequence identity to Rs3367 and uses ACE2 from humans, civets and Chinese horseshoe bats for cell entry. Preliminary in vitro testing indicates that WIV1 also has a broad species tropism. Our results provide the strongest evidence to date that Chinese horseshoe bats are natural reservoirs of SARS-CoV, and that intermediate hosts may not be necessary for direct human infection by some bat SL-CoVs. They also highlight the importance of pathogen-discovery programs targeting high-risk wildlife groups in emerging disease hotspots as a strategy for pandemic preparedness.

Ge, X.-Y., Wang, N., Zhang, W., Hu, B., Li, B., Zhang, Y.-Z., Zhou, J.-H., Luo, C.-M., Yang, X.-L. and Wu, L.-J. (2016) Coexistence of multiple coronaviruses in several bat colonies in an abandoned mineshaft. Virologica Sinica 31:31-40. [doi: 10.1007/s12250-016-3713-9]

Since the 2002–2003 severe acute respiratory syndrome (SARS) outbreak prompted a search for the natural reservoir of the SARS coronavirus, numerous alpha- and betacoronaviruses have been discovered in bats around the world. Bats are likely the natural reservoir of alpha- and beta-coronaviruses, and due to the rich diversity and global distribution of bats, the number of bat coronaviruses will likely increase. We conducted a surveillance of coronaviruses in bats in an abandoned mineshaft in Mojiang County, Yunnan Province, China, from 2012–2013. Six bat species were frequently detected in the cave: Rhinolophus sinicus, Rhinolophus affinis, Hipposideros pomona, Miniopterus schreibersii, Miniopterus fuliginosus, and Miniopterus fuscus. By sequencing PCR products of the coronavirus RNA-dependent RNA polymerase gene (RdRp), we found a high frequency of infection by a diverse group of coronaviruses in different bat species in the mineshaft. Sequenced partial RdRp fragments had 80%–99% nucleic acid sequence identity with well-characterized Alphacoronavirus species, including BtCoV HKU2, BtCoV HKU8, and BtCoV1, and unassigned species BtCoV HKU7 and BtCoV HKU10. Additionally, the surveillance identified two unclassified betacoronaviruses, one new strain of SARS-like coronavirus, and one potentially new betacoronavirus species. Furthermore, coronavirus co-infection was detected in all six bat species, a phenomenon that fosters recombination and promotes the emergence of novel virus strains. Our findings highlight the importance of bats as natural reservoirs of coronaviruses and the potentially zoonotic source of viral pathogens.

Guo, H., Hu, B., Si, H.-r., Zhu, Y., Zhang, W., Li, B., Li, A., Geng, R., Lin, H.-F. and Yang, X.-L. (2021) Identification of a novel lineage bat SARS-related coronaviruses that use bat ACE2 receptor. bioRxiv. [doi: 10.1101/2021.05.21.445091]

Severe respiratory disease coronavirus-2 (SARS-CoV-2) causes the most devastating disease, COVID-19, of the recent century. One of the unsolved scientific questions around SARS-CoV-2 is the animal origin of this virus. Bats and pangolins are recognized as the most probable reservoir hosts that harbor the highly similar SARS-CoV-2 related viruses (SARSr-CoV-2). Here, we report the identification of a novel lineage of SARSr-CoVs, including RaTG15 and seven other viruses, from bats at the same location where we found RaTG13 in 2015. Although RaTG15 and the related viruses share 97.2% amino acid sequence identities to SARS-CoV-2 in the conserved ORF1b region, but only show less than 77.6% to all known SARSr-CoVs in genome level, thus forms a distinct lineage in the Sarbecovirus phylogenetic tree. We then found that RaTG15 receptor binding domain (RBD) can bind to and use Rhinolophus affinis bat ACE2 (RaACE2) but not human ACE2 as entry receptor, although which contains a short deletion and has different key residues responsible for ACE2 binding. In addition, we show that none of the known viruses in bat SARSr-CoV-2 lineage or the novel lineage discovered so far use human ACE2 efficiently compared to SARSr-CoV-2 from pangolin or some of the SARSr-CoV-1 lineage viruses. Collectively, we suggest more systematic and longitudinal work in bats to prevent future spillover events caused by SARSr-CoVs or to better understand the origin of SARS-CoV-2.

MacLean, O.A., Lytras, S., Weaver, S., Singer, J.B., Boni, M.F., Lemey, P., Pond, S.L.K. and Robertson, D.L. (2021) Natural selection in the evolution of SARS-CoV-2 in bats created a generalist virus and highly capable human pathogen. PLoS Biology 19:e3001115. [doi: 10.1371/journal.pbio.3001115]

Virus host shifts are generally associated with novel adaptations to exploit the cells of the new host species optimally. Surprisingly, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has apparently required little to no significant adaptation to humans since the start of the Coronavirus Disease 2019 (COVID-19) pandemic and to October 2020. Here we assess the types of natural selection taking place in Sarbecoviruses in horseshoe bats versus the early SARS-CoV-2 evolution in humans. While there is moderate evidence of diversifying positive selection in SARS-CoV-2 in humans, it is limited to the early phase of the pandemic, and purifying selection is much weaker in SARS-CoV-2 than in related bat Sarbecoviruses. In contrast, our analysis detects evidence for significant positive episodic diversifying selection acting at the base of the bat virus lineage SARS-CoV-2 emerged from, accompanied by an adaptive depletion in CpG composition presumed to be linked to the action of antiviral mechanisms in these ancestral bat hosts. The closest bat virus to SARS-CoV-2, RmYN02 (sharing an ancestor about 1976), is a recombinant with a structure that includes differential CpG content in Spike; clear evidence of coinfection and evolution in bats without involvement of other species. While an undiscovered “facilitating” intermediate species cannot be discounted, collectively, our results support the progenitor of SARS-CoV-2 being capable of efficient human–human transmission as a consequence of its adaptive evolutionary history in bats, not humans, which created a relatively generalist virus.

Wacharapluesadee, S., Tan, C.W., Maneeorn, P., Duengkae, P., Zhu, F., Joyjinda, Y., Kaewpom, T., Chia, W.N., Ampoot, W. and Lim, B.L. (2021) Evidence for SARS-CoV-2 related coronaviruses circulating in bats and pangolins in Southeast Asia. Nature Communications 12:1-9. [doi: 10.1038/s41467-021-21240-1]

Among the many questions unanswered for the COVID-19 pandemic are the origin of SARS-CoV-2 and the potential role of intermediate animal host(s) in the early animal-to-human transmission. The discovery of RaTG13 bat coronavirus in China suggested a high probability of a bat origin. Here we report molecular and serological evidence of SARS-CoV-2 related coronaviruses (SC2r-CoVs) actively circulating in bats in Southeast Asia. Whole genome sequences were obtained from five independent bats (Rhinolophus acuminatus) in a Thai cave yielding a single isolate (named RacCS203) which is most related to the RmYN02 isolate found in Rhinolophus malayanus in Yunnan, China. SARS-CoV-2 neutralizing antibodies were also detected in bats of the same colony and in a pangolin at a wildlife checkpoint in Southern Thailand. Antisera raised against the receptor binding domain (RBD) of RmYN02 was able to cross-neutralize SARS-CoV-2 despite the fact that the RBD of RacCS203 or RmYN02 failed to bind ACE2. Although the origin of the virus remains unresolved, our study extended the geographic distribution of genetically diverse SC2r-CoVs from Japan and China to Thailand over a 4800-km range. Cross-border surveillance is urgently needed to find the immediate progenitor virus of SARS-CoV-2.

Zhou, P., Yang, X.-L., Wang, X.-G., Hu, B., Zhang, L., Zhang, W., Si, H.-R., Zhu, Y., Li, B. and Huang, C.-L. (2020) A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 579:270-273. [doi: 10.1038/s41586-020-2012-7]

Since the outbreak of severe acute respiratory syndrome (SARS) 18 years ago, a large number of SARS-related coronaviruses (SARSr-CoVs) have been discovered in their natural reservoir host, bats1,2,3,4. Previous studies have shown that some bat SARSr-CoVs have the potential to infect humans5,6,7. Here we report the identification and characterization of a new coronavirus (2019-nCoV), which caused an epidemic of acute respiratory syndrome in humans in Wuhan, China. The epidemic, which started on 12 December 2019, had caused 2,794 laboratory-confirmed infections including 80 deaths by 26 January 2020. Full-length genome sequences were obtained from five patients at an early stage of the outbreak. The sequences are almost identical and share 79.6% sequence identity to SARS-CoV. Furthermore, we show that 2019-nCoV is 96% identical at the whole-genome level to a bat coronavirus. Pairwise protein sequence analysis of seven conserved non-structural proteins domains show that this virus belongs to the species of SARSr-CoV. In addition, 2019-nCoV virus isolated from the bronchoalveolar lavage fluid of a critically ill patient could be neutralized by sera from several patients. Notably, we confirmed that 2019-nCoV uses the same cell entry receptor—angiotensin converting enzyme II (ACE2)—as SARS-CoV.

Zhou, P. et al. (2020) Addendum: A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 588:E6-E6. [doi: 10.1038/s41586-020-2951-z]

Zhou, H., Ji, J., Chen, X., Bi, Y., Li, J., Hu, T., Song, H., Chen, Y., Cui, M. and Zhang, Y. (2021) Identification of novel bat coronaviruses sheds light on the evolutionary origins of SARS-CoV-2 and related viruses. bioRxiv. [doi: 10.1101/2021.03.08.434390]

Although a variety of SARS-CoV-2 related coronaviruses have been identified, the evolutionary origins of this virus remain elusive. We describe a meta-transcriptomic study of 411 samples collected from 23 bat species in a small (~1100 hectare) region in Yunnan province, China, from May 2019 to November 2020. We identified coronavirus contigs in 40 of 100 sequencing libraries, including seven representing SARS-CoV-2-like contigs. From these data we obtained 24 full-length coronavirus genomes, including four novel SARS-CoV-2 related and three SARS-CoV related genomes. Of these viruses, RpYN06 exhibited 94.5% sequence identity to SARS-CoV-2 across the whole genome and was the closest relative of SARS-CoV-2 in the ORF1ab, ORF7a, ORF8, N, and ORF10 genes. The other three SARS-CoV-2 related coronaviruses were nearly identical in sequence and clustered closely with a virus previously identified in pangolins from Guangxi, China, although with a genetically distinct spike gene sequence. We also identified 17 alphacoronavirus genomes, including those closely related to swine acute diarrhea syndrome virus and porcine epidemic diarrhea virus. Ecological modeling predicted the co-existence of up to 23 Rhinolophus bat species in Southeast Asia and southern China, with the largest contiguous hotspots extending from South Lao and Vietnam to southern China. Our study highlights both the remarkable diversity of bat viruses at the local scale and that relatives of SARS-CoV-2 and SARS-CoV circulate in wildlife species in a broad geographic region of Southeast Asia and southern China. These data will help guide surveillance efforts to determine the origins of SARS-CoV-2 and other pathogenic coronaviruses.

Saturday, June 05, 2021

Real scientists discuss the lab leak conspiracy theory

Here's an interesting video where the hosts of "This Week in Virology" (Vincent Racaniello, Rich Condit, and Kathy Spindler) discuss the origin of COVID-19 with three scientists who were on the WHO investigation committee that visited the Wuhan Institute of Virology a few months ago (Peter Daszak, Thea Kølsen Fischer, and Marion Koopmans). If you've fallen for the lab leak conspiracy theory then you need to watch the entire video. The rest of you might want to skip to 50 minutes where they discuss the lab leak accusation and relate how they interviewed the scientists at WIV.

The WHO scientists want to emphasize three things: (1) it is extremely unlikely that SARS-CoV-2 was being studied at WIV so it couldn't have escaped from there; (2) there is no evidence to support the lab leak conspiracy theory but if any evidence shows up they are perfectly willing to investigate; (3) it's very likely that SARS-CoV-2 originated naturally in the wild and all efforts should be focused on the most likely scenario and not on an extremely unlikely scenario.

After the interview is over, the three hosts talk about the lab leak conspiracy theory. You should hear what they have to say about Nicholas Wade and his failure to understand the furin cleavage site (1:10 minutes)! And they have lots to say about everything else in the Wade article. You need to watch that discussion if you are really interested in science and not half-baked conspiracy theories.

The next video is an interview with Robert Garry, a virologist at Tulane University in New Orleans. His area of expertise is emerging infectious viruses. Listen to Garry and the hosts discuss the possibility that SARS-CoV-2 was present in the Wuhan Institute of Virology and released accidentally to start the pandemic (starting at 15 mins). It's good to hear real experts debunk the conspiracy theory.


Monday, May 31, 2021

Nessa Carey talks about epigenetics

Nessa Carey wrote a horrible book about junk DNA where she completely misunderstood the science. It's one of many examples of bad science journalism [Nessa Carey doesn't understand junk DNA].

I recently became aware of a talk given in 2015 by Nessa Carey on epigenetics so I'm posting it here. (She also wrote a book about epigenetics.) She is an entertaining speaker and gives a very good presentation but that's a problem if the science is misleading. Judge for yourselves.


Sunday, May 30, 2021

Telomere-to-telomere sequencing of a complete human genome

Here's a paper that has recently been posted on the preprint server bioRxiv.

Nurk et al. (2021) The complete sequence of a human genome. [doi: 10.1101/2021.05.26.445798]

I usually don't like to comment on preprints but this one is surely going to be published somewhere and it's important.

The authors have sequenced the entire chromosomes (telomere-to-telomere) of the 22 autosomes and the X chromosome of the cell line CHM13. The cell line is a complete hydatidiform mole, which means it is derived from a molar pregnancy where a sperm combines with an egg cell that has lost its nucleus. The sperm DNA duplicates, giving rise to cells that have two identical copies of each chromosome. The karyotype of the CHM13 cell line is 46,XX. The advantage of sequencing the DNA from such cell lines is that the interpretation of the sequencing results is not complicated by the heterogeneity of normal diploid cell lines. This was important because the focus of this study was on sequencing repetitive regions of the chromosomes and most chromosome pairs have different numbers of repeats.

Wednesday, May 26, 2021

The SARS-CoV-2 reference genome

Chinese scientists isolated virus particles from a patient admitted to hospital on December 26, 2019 in Wuhan, China. The RNA genome was sequenced and the sequence was immediately distributed to interested scientists around the world. It was submitted to GenBank on January 5, 2020 and appeared as entry NC_045512 on January 13, 2020 [Wuhan seafood market pneumonia virus isolate Wuhan-Hu-1, complete genome].

The original GenBank record was annotated and updated by NIH staff on January 17, 2020. The record, last modified on July 18, 2020, still appears as locus NC_045512 but the virus is now called SARS-CoV-2 [Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome].

The sequence was extensively mapped and analyzed by Chinese scientists in Shanghai, Wuhan, and Beijing, and Ed Holmes in Sydney, Australia and the results were submitted to Nature on January 7, 2020 and published on February 3, 2020.

Wu, F., Zhao, S., Yu, B., Chen, Y.-M., Wang, W., Song, Z.-G., Hu, Y., Tao, Z.-W., Tian, J.-H., Pei, Y.-Y., Yuan, M.-L., Zhang, Y.-L., Dai, F.-H., Liu, Y., Wang, Q.-M., Zheng, J.-J., Xu, L., Holmes, E.C. and Zhang, Y.-Z. (2020) A new coronavirus associated with human respiratory disease in China. Nature 579:265-269. [doi: 10.1038/s41586-020-2008-3]

Emerging infectious diseases, such as severe acute respiratory syndrome (SARS) and Zika virus disease, present a major threat to public health1–3. Despite intense research efforts, how, when and where new diseases appear are still a source of considerable uncertainty. A severe respiratory disease was recently reported in Wuhan, Hubei province, China. As of 25 January 2020, at least 1,975 cases had been reported since the first patient was hospitalized on 12 December 2019. Epidemiological investigations have suggested that the outbreak was associated with a seafood market in Wuhan. Here we study a single patient who was a worker at the market and who was admitted to the Central Hospital of Wuhan on 26 December 2019 while experiencing a severe respiratory syndrome that included fever, dizziness and a cough. Metagenomic RNA sequencing of a sample of bronchoalveolar lavage fluid from the patient identified a new RNA virus strain from the family Coronaviridae, which is designated here ‘WH-Human 1’ coronavirus (and has also been referred to as ‘2019-nCoV’). Phylogenetic analysis of the complete viral genome (29,903 nucleotides) revealed that the virus was most closely related (89.1% nucleotide similarity) to a group of SARS-like coronaviruses (genus Betacoronavirus, subgenus Sarbecovirus) that had previously been found in bats in China. This outbreak highlights the ongoing ability of viral spill-over from animals to cause severe disease in humans.
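For readers wondering what a figure like "89.1% nucleotide similarity" amounts to, here's a minimal sketch of the simplest version of the calculation: the percentage of matching positions between two sequences that have already been aligned. This is only an illustration. The function name and the rule of skipping gap columns are my assumptions, and real phylogenetic analyses, including the one in the paper, use alignment and tree-building software that does much more than this.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned columns where both sequences carry the same base.

    Assumes the two sequences are already aligned (same length, '-' for gaps).
    Columns with a gap in either sequence are skipped, which is one
    convention among several.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy example: nine of ten positions match.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Keep in mind that a single percentage hides a lot. Two viruses can be very similar in one gene and quite different in another, which is why the papers report identities for individual regions as well as for whole genomes.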

The Nature paper notes that this is a novel coronavirus related to known bat coronaviruses but its exact origin remains unclear. The authors also mention that the origin of other disease-causing coronavirus-like viruses is also unknown.

Coronaviruses are associated with a number of infectious disease outbreaks in humans, including SARS in 2002–2003 and Middle East respiratory syndrome (MERS) in 2012. Four other coronaviruses—human coronaviruses HKU1, OC43, NL63 and 229E—are also associated with respiratory disease. Although SARS-like coronaviruses have been widely identified in mammals including bats since 2005 in China, the exact origin of human-infected coronaviruses remains unclear. Here we describe a new coronavirus—WHCV—in the BALF from a patient who experienced severe respiratory disease in Wuhan, China. Phylogenetic analysis suggests that WHCV is a member of the genus Betacoronavirus (subgenus Sarbecovirus) that has some genomic and phylogenetic similarities to SARS-CoV1, particularly in the RBD of the spike protein. These genomic and clinical similarities to SARS, as well as its high abundance in clinical samples, provides evidence for an association between WHCV and the ongoing outbreak of respiratory disease in Wuhan and across the world. Although the isolation of the virus from only a single patient is not sufficient to conclude that it caused these respiratory symptoms, our findings have been independently corroborated in further patients in a separate study.

The identification of multiple SARS-like CoVs in bats have led to the idea that these animals act as hosts of a natural reservoir of these viruses. Although SARS-like viruses have been identified widely in bats in China, viruses identical to SARS-CoV have not yet been documented. Notably, WHCV is most closely related to bat coronaviruses, and shows 100% amino acid similarity to bat SL-CoVZC45 in the nsp7 and E proteins (Supplementary Table 3). Thus, these data suggest that bats are a possible host for the viral reservoir of WHCV. However, as a variety of animal species were for sale in the market when the disease was first reported, further studies are needed to determine the natural reservoir and any intermediate hosts of WHCV.

Subsequent work suggests that the virus did not originate in the Wuhan market but was circulating in Wuhan in November 2019 among a small number of people who were not associated with the market. It looks like market workers were the source of a superspreader event.

It's important to keep in mind that the exact origin of several other viral diseases has never been determined. This is quite normal so don't be fooled by people who think that the mysterious origin of SARS-CoV-2 demands an immediate explanation. That's likely not going to happen no matter how many outside investigators go snooping around Wuhan looking for clues to support their favorite conspiracy theory.


Monday, May 10, 2021

MIT Professor Rick Young doesn't understand junk DNA

Richard ("Rick") Young is a Professor of Biology at the Massachusetts Institute of Technology and a member of the Whitehead Institute. His area of expertise is the regulation of gene expression in eukaryotes.

He was interviewed by Jorge Conde and Hanne Winarsky on a recent podcast (Feb. 1, 2021) where the main topic was "From Junk DNA to an RNA Revolution." They get just about everything wrong when they talk about junk DNA including the Central Dogma, historical estimates of the number of genes, confusing noncoding DNA with junk, alternative splicing, the number of functional RNAs, the amount of regulatory DNA, and assuming that scientists in the 1970s were idiots.

In this episode, a16z General Partner Jorge Conde and Bio Eats World host Hanne Winarsky talk to Professor Rick Young, Professor of Biology and head of the Young Lab at MIT—all about “junk” DNA, or non-coding DNA.

Which, it turns out—spoiler alert—isn’t junk at all. Much of this so-called junk DNA actually encodes RNA—which we now know has all sorts of incredibly important roles in the cell, many of which were previously thought of as only the domain of proteins. This conversation is all about what we know about what that non-coding genome actually does: how RNA works to regulate all kinds of different gene expression, cell types, and functions; how this has dramatically changed our understanding of how disease arises; and most importantly, what this means we can now do—programming cells, tuning functions up or down, or on or off. What we once thought of as “junk” is now giving us a powerful new tool in intervening in and treating disease—bringing in a whole new category of therapies.

Here's what I don't understand. How could a prominent scientist at one of the best universities in the world be so ignorant of a topic he chooses to discuss on a podcast? Perhaps you could excuse a busy scientist who doesn't have the time to research the topic but what excuse can you offer to explain why the entire culture at MIT and the Whitehead must also be ignorant? Does nobody there ever question their own ideas? Do they only read the papers that support their views and ignore all those that challenge those views?

This is a very serious question. It's the most difficult question I discuss in my book. Why has the false narrative about junk DNA, and many other things, dominated the scientific literature and become accepted dogma among leading scientists? Something is seriously wrong with science.


Saturday, May 08, 2021

World Health Organization (WHO) report on the natural origin theory of SARS-CoV-2

The origin of SARS-CoV-2 is a hot topic these days. As far as I can tell, the consensus view among the experts is that the ancestor is from bats but it evolved in an intermediate host before jumping to humans. However, there's a vocal group who claim that the virus was engineered in a lab in the Wuhan Institute of Virology and accidentally escaped causing a pandemic. A group of scientists from WHO investigated this speculation and decided that it was "extremely unlikely." I posted a summary of their analysis a few days ago [World Health Organization (WHO) report on the lab leak conspiracy theory].

That's not going to put an end to the speculation since proponents of the lab leak hypothesis are now saying that the WHO report, and the opinion of other experts, can't be trusted. They claim that there's a widespread conspiracy to lie and cover up the fact that SARS-CoV-2 was created in a lab and leaked to the Wuhan population.

There's not much point in arguing with people once they go down the conspiracy theory path since they will dismiss every argument by claiming that it's part of the conspiracy. However, it's worth pointing out that there's a perfectly valid alternative explanation; namely, natural origin. For those who still have an open mind I'm posting the explanation of the WHO scientific team who conclude that this is the most likely explanation [WHO-convened global study of origins of SARS-CoV-2: China Part].

Introduction through intermediate host followed by zoonotic transmission

Explanation of hypothesis

SARS-CoV-2 is transmitted from an animal reservoir to an animal host, followed by subsequent spread within that intermediate host (spillover host), and then transmission to humans. The passage through an intermediate host can be without or with virus adaptation.

Arguments in favour

Although the closest related viruses have been found in bats, the evolutionary distance between these bat viruses and SARS-CoV-2 is estimated to be several decades, suggesting a missing link (either a missing progenitor virus, or evolution of a progenitor virus in an intermediate host). Highly similar viruses have also been found in pangolins, suggesting cross-species transmission from bats at least once, but again with considerable genetic distance. Both these putative hosts are infrequently in contact with humans, and an intermediary step involving an amplifying host has been observed for several other emerging viruses (Henipaviruses, influenza viruses, SARS-CoV and MERS-CoV). SARS-CoV-2 infection and intraspecies spread (including further transmission to humans) has been documented in an increasing number of animal species, particularly mustelids and felids. SARS-CoV-2 adapts relatively rapidly in susceptible animals (such as mink). The increasing number of animals shown to be susceptible to SARS-CoV-2 includes animals that are farmed in sufficient densities to allow potential for enzootic circulation. High-density farming is common in many places across the world and includes many livestock species as well as farmed wildlife. There was a large network of domesticated wild animal farms, supplying farmed wildlife. In high-density farms, there often are connections between farms (for instance, through the workforce and food supply), leading to complex transmission pathways that may be difficult to unravel, as was observed in other zoonotic outbreaks involving farmed animals. Optimized conditions for sustained virus transmission chains in large-scale animal farms may also impact on virus seasonality in favour of a year-round endemic transmission pattern, and thereby increasing the zoonotic risk in winter months.

Arguments against

SARS-CoV-2 has been identified in an increasing number of animal species, but genetic and epidemiological studies have suggested that these were infections introduced from humans, rather than enzootic virus circulation. In addition, since the containment of SARS-CoV-2 in China, new outbreaks have occurred for which genomic sequence data was generated. Based on epidemiological analysis and genetic sequencing of viruses from new cases throughout 2020, there is no evidence of repeated introduction of early SARS-CoV-2 strains of potential animal origins into humans in China. There was no genetic or serological evidence for SARS-CoV-2 in a wide range of domestic and wild animals tested to date. The screening of the major livestock species was done across the country and provided no evidence for circulation of a related virus. The scale of testing in these species was such that widespread circulation is extremely unlikely. Screening of farmed wildlife was limited but did not provide conclusive evidence for the existence of circulation.

Assessment of likelihood

Based on the above arguments, the scenario including introduction through an intermediary host was considered to be likely to very likely.

I should note that it's often very difficult to figure out who's right and who's wrong in a scientific controversy but in general there's one group that appears to be thinking critically and one that's not. Critical thinking is also hard to recognize but when I was teaching it we emphasized one important clue. Critical thinkers usually present both sides of an argument and discuss not only their own opinions but also the views of the other side. That's one of the things that impress me about the WHO report. It doesn't mean that they are necessarily correct but they sure look a lot better than proponents of the lab leak conspiracy theory who seem to dismiss out of hand the possibility of a natural origin.

I'd also like to make note of the fact that the WHO is not a perfect organization. They have made mistakes during this pandemic, as has every single government on the planet (some more than others). I'm not defending everything that WHO has done but I don't see any reason to be overly suspicious of the integrity of the scientists who wrote this report.


Friday, May 07, 2021

More misinformation about junk DNA: this time it's in American Scientist

Emily Mortola and Manyuan Long have just published an article in American Scientist about Turning Junk into Us: How Genes Are Born. The article contains a lot of misinformation about junk DNA that I'll discuss below.

Emily Mortola is a freelance science writer who worked with Manyuan Long when she was an undergraduate (I think). Manyuan Long is the Edna K. Papazian Distinguished Service Professor of Ecology and Evolution in the Department of Ecology and Evolution at the University of Chicago. His main research interest is the origin of new genes. It's reasonable to suspect that he's an expert on genome structure and evolution.

The article is behind a paywall so most of you can't see anything more than the opening paragraphs so let's look at those first. The second sentence is ...

As we discovered in 2003 with the conclusion of the Human Genome Project, a monumental 13-year-long research effort to sequence the entire human genome, approximately 98.8 percent of our DNA was categorized as junk.

This is not correct. The paper on the finished version of the human genome sequence was published in October 2004 (Finishing the euchromatic sequence of the human genome) and the authors reported that the coding exons of protein-coding genes covered about 1.2% of the genome. However, the authors also noted that there are many genes for tRNAs, ribosomal RNAs, snoRNAs, microRNAs, and probably other functional RNAs. Although they don't mention it, the authors must also have been aware of regulatory sequences, centromeres, telomeres, origins of replication and possibly other functional elements. They never said that all noncoding DNA (98.8%) was junk because that would be ridiculous. It's even more ridiculous to say it in 2021 [Stop Using the Term "Noncoding DNA:" It Doesn't Mean What You Think It Means].

The part of the article that you can see also lists a few "Quick Takes" and one of them is ...

Close to 99 percent of our genome has been historically classified as noncoding, useless “junk” DNA. Consequently, these sequences were rarely studied.

This is also incorrect as many scientists have pointed out repeatedly over the past fifty years or so. At no time in the past 50 years has any knowledgeable scientist ever claimed that all noncoding DNA is junk. I'm sorely tempted to accuse the authors of this article of lying because they really should know better, especially if they're writing an article about junk DNA in 2021. However, I reluctantly defer to Hanlon's razor.

Mortola and Long claim that mammalian genomes have between 85% and 99% junk DNA and wonder if it could have a function.

To most geneticists, the answer was that it has no function at all. The flow of genetic information—the central dogma of molecular biology—seems to leave no role for all of our intergenic sequences. In the classical view, a gene consists of a sequence of nucleotides of four possible types--adenine, cytosine, guanine, and thymine--represented by the letters A, C, G, and T. Three nucleotides in a row make up a codon, with each codon corresponding to a specific amino acid, or protein subunit, in the final protein product. In active genes, harmful mutations are weeded out by selection and beneficial ones are allowed to persist. But noncoding regions are not expressed in the form of a protein, so mutations in noncoding regions can be neither harmful nor beneficial. In other words, "junk" mutations cannot be steered by natural selection.

Those of you who have read this far will cringe when reading that. There are so many obvious errors in that paragraph that applying Hanlon's razor seems very complimentary. Imagine saying in the 21st century that the Central Dogma leaves no role at all for regulatory sequences or ribosomal RNA genes! But there's more; the authors double down on their incorrect understanding of "gene" in order to fit their misunderstanding of the Central Dogma.

What Is a Gene, Really?

In our de novo gene studies in rice, to truly assess the potential significance of de novo genes, we relied on a strict definition of the word "gene" with which nearly every expert can agree. First, in order for a nucleotide sequence to be considered a true gene, an open reading frame (ORF) must be present. The ORF can be thought of as the "gene itself"; it begins with a starting mark common for every gene and ends with one of three possible finish line signals. One of the key enzymes in this process, the RNA polymerase, zips along the strand of DNA like a train on a monorail, transcribing it into its messenger RNA form. This point brings us to our second important criterion: A true gene is one that is both transcribed and translated. That is, a true gene is first used as a template to make transient messenger RNA, which is then translated into a protein.

Five Things You Should Know if You Want to Participate in the Junk DNA Debate

The authors admit in the next paragraph that some pseudogenes may produce functional RNAs that are never translated into proteins but they don't mention any other types of gene. I can understand why you might concentrate on protein-coding genes if you are studying de novo genes but why not just say that there are two types of genes and either one can arise de novo? But there's another problem with their definition: they left out a key property of a gene. It's not sufficient that a given stretch of DNA is transcribed and the RNA is translated to make a protein: the protein has to have a function before you can say that the stretch of DNA is a gene [What Is a Gene?]. We'll see in a minute why this is important.

The main point of the paper is the birth of de novo genes and the authors discuss their work with the rice genome. They say they've discovered 175 de novo genes but they don't say how many have a real biological function. This is an important problem in this field and it would have been fascinating to see a description of how they go about assigning a function to their mostly small peptides [The evolution of de novo genes]. I'm guessing that they just assume a function as soon as they recognize an open reading frame in a transcript.
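To see why recognizing an open reading frame is such a weak criterion, here's a minimal sketch of how little it takes to find one. This is a toy example, not the authors' method: the function name, the minimum length cutoff, and the decision to scan only the three forward reading frames are all my assumptions for illustration.

```python
# Illustrative only: a toy scan for open reading frames (ORFs) on one strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end) index pairs for ATG...stop stretches.

    Scans the three forward reading frames. 'end' is exclusive and
    includes the stop codon. min_codons counts the start and stop codons.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


# A made-up sequence with one short ORF: ATG AAA TTT TAG
print(find_orfs("CCATGAAATTTTAGGG"))
```

A scan like this will turn up short ORFs in almost any long stretch of DNA, including random sequence, purely by chance. Finding one, or even finding that it's transcribed, says nothing about whether the product does anything useful for the organism.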

As you can see from the title of the article, the emphasis is on the idea that de novo genes can arise from junk DNA—a concept that's not seriously disputed. The one good thing about the article is that the authors do not directly state that the reason for junk DNA is to give rise to new genes but this caption is troubling.

The Human Genome Project was a 13-year-long research effort aimed at mapping the entire human genetic sequence. One of its most intriguing findings was the observation that the number of protein-coding genes estimated to exist in humans--approximately 22,300--represents a mere 1.2 percent of our whole genome, with the other 98.8 percent being categorized as noncoding, useless junk. Analyses of this presumed junk DNA in diverse species are now revealing its role in the creation of genes.

Why do science writers continue to spread misinformation about junk DNA when there's so much correct information out there? All you have to do is look [More misconceptions about junk DNA - what are we doing wrong?].


World Health Organization (WHO) report on the lab leak conspiracy theory

There's been a lot of talk about the possibility that SARS-CoV-2 originated in the Wuhan Institute of Virology and accidentally escaped, causing the COVID-19 pandemic. There's no evidence that directly supports this possibility and plenty of evidence that casts serious doubt on the lab leak hypothesis. In order to discount the evidence against the hypothesis, its supporters claim that scientists are lying and covering up the accidental release with the active cooperation of the Chinese government. Thus, an original scientific hypothesis has morphed into a full-blown conspiracy theory.

As with any conspiracy theory, there are all kinds of "facts" that have only been uncovered on Twitter or Reddit but there are also speculations published by the Trump administration. It's very difficult to verify or refute many of these "facts."

However, there's one fact that is widely misinterpreted and that's the report of the WHO scientists who visited the Wuhan Institute of Virology in order to investigate the lab leak hypothesis. They concluded that it was "extremely unlikely" so, as you might expect, the WHO scientists are now part of the conspiracy. Here's a copy of the section on the lab leak hypothesis from the WHO full report issued on March 30, 2021 [WHO-convened global study of origins of SARS-CoV-2: China Part].

Introduction through a laboratory incident

Explanation of hypothesis

SARS-CoV-2 is introduced through a laboratory incident, reflecting an accidental infection of staff from laboratory activities involving the relevant viruses. We did not consider the hypothesis of deliberate release or deliberate bioengineering of SARS-CoV-2 for release, the latter has been ruled out by other scientists following analyses of the genome (3).

Arguments in favour

Although rare, laboratory accidents do happen, and different laboratories around the world are working with bat CoVs. When working in particular with virus cultures, but also with animal inoculations or clinical samples, humans could become infected in laboratories with limited biosafety, poor laboratory management practice, or following negligence. The closest known CoV RaTG13 strain (96.2%) to SARS-CoV-2 detected in bat anal swabs have been sequenced at the Wuhan Institute of Virology. The Wuhan CDC laboratory moved on 2nd December 2019 to a new location near the Huanan market. Such moves can be disruptive for the operations of any laboratory.

Arguments against

The closest relatives of SARS-CoV-2 from bats and pangolin are evolutionarily distant from SARS-CoV-2. There has been speculation regarding the presence of human ACE2 receptor binding and a furin-cleavage site in SARS-CoV-2, but both have been found in animal viruses as well, and elements of the furin-cleavage site are present in RmYN02 and the new Thailand bat SARSr-CoV. There is no record of viruses closely related to SARS-CoV-2 in any laboratory before December 2019, or genomes that in combination could provide a SARS-CoV-2 genome. Regarding accidental culture, prior to December 2019, there is no evidence of circulation of SARS-CoV-2 among people globally and the surveillance programme in place was limited regarding the number of samples processed and therefore the risk of accidental culturing SARS-CoV-2 in the laboratory is extremely low. The three laboratories in Wuhan working with either CoVs diagnostics and/or CoVs isolation and vaccine development all had high quality biosafety level (BSL3 or 4) facilities that were well-managed, with a staff health monitoring programme with no reporting of COVID-19 compatible respiratory illness during the weeks/months prior to December 2019, and no serological evidence of infection in workers through SARS-CoV-2-specific serology-screening. The Wuhan CDC lab which moved on 2nd December 2019 reported no disruptions or incidents caused by the move. They also reported no storage nor laboratory activities on CoVs or other bat viruses preceding the outbreak.

Assessment of likelihood

In view of the above, a laboratory origin of the pandemic was considered to be extremely unlikely.

Please refer to the original report whenever you see the conspiracy theorists making claims about what WHO did or did not report. Those claims are not always accurate; for example, it is widely reported that WHO confirmed that there were COVID-19 cases among lab workers in the autumn of 2019 but, as you can see, WHO refuted that part of the conspiracy theory.


Wednesday, May 05, 2021

Lab leak conspiracy theory rears its ugly head again: this time it's Nicholas Wade of the New York Times

Nicholas Wade used to be a serious science writer but he lost that title many years ago when he proved that he was incapable of distinguishing fact from wishful thinking [Nicholas Wade on the Origin of Life]. Now he's gone completely bonkers by promoting the ridiculous conspiracy theory that the COVID-19 pandemic was started when the SARS-CoV-2 virus leaked from a lab at the Wuhan Institute of Virology (WIV) [Origin of Covid — Following the Clues].

Nicholas Wade claims that the virologists at the WIV, led by Dr. Shi, created the SARS-CoV-2 virus by genetic engineering. Their goal, according to Wade, was to make a virus that was as deadly to humans as possible in order to study its effects in the lab. Unfortunately, the virus escaped from the lab, according to Wade, and started the pandemic.

Shi Zhengli responded to those silly accusations in July 2020 [Wuhan coronavirus hunter Shi Zhengli speaks out].

On 15 July, Shi emailed Science answers to a series of questions about the virus' origin and her research. In them, she hit back at speculation that the virus leaked from WIV. She and her colleagues discovered the virus in late 2019, she says, in samples from patients who had a pneumonia of unknown origin. “Before that, we had never been in contact with or studied this virus, nor did we know of its existence,” Shi wrote.

“U.S. President Trump's claim that SARS-CoV-2 was leaked from our institute totally contradicts the facts,” she added. “It jeopardizes and affects our academic work and personal life. He owes us an apology.”

Why is this a conspiracy theory? Because the speculation has been investigated by WHO scientists who found no evidence to support it. They saw that the lab protocols at the Institute were very good, as you would expect for a world class lab that was studying dangerous viruses that were known to cause pandemics. Furthermore, none of the workers at the lab tested positive for COVID-19 and none of them were studying any virus that resembled SARS-CoV-2. So, in order for the lab leak hypothesis to be true there has to have been a massive coverup by a very large number of people. That's what makes it a conspiracy theory.

Nicholas Wade gets a lot of his information from Richard Ebright who has been promoting the lab leak conspiracy theory for the past year. Ebright thinks the WHO investigators "... were willing—and in at least one case, enthusiastic—participants in disinformation" [An Interview with Richard Ebright: The WHO Investigation Members Were “participants in disinformation”]. This is classic conspiracy theory stuff: everyone who disagrees with you is part of the conspiracy.

If you still think the lab leak conspiracy theory is true then I urge you to watch this video of a talk by Professor Edward ("Eddy") Holmes, the 2020 New South Wales (Australia) scientist of the year and an expert on human viruses, especially the coronaviruses [The Discovery and Origins of SARS-CoV-2]. He explains why the viruses are likely to originate in bats and explains why this particular virus started off in bats but probably passed through an intermediate host before reaching humans. (His preferred intermediate host is raccoon dogs and he explains why he thinks this is likely.) He explains why the sequence of the virus is entirely consistent with a natural origin. He describes his field work in China and Southeast Asia and his collaborations with the expert scientists in China, including those at the Wuhan Institute of Virology.

Holmes addresses the conspiracy theory at 41:45 minutes into the talk so you can skip right to there if you like, although I don't recommend it because there's lots of useful information in the first 40 minutes. Here's why he rejects that conspiracy theory and why you should too. These are the facts, according to Holmes. I agree with him.

  • There's "no evidence that SARS-CoV-2 is engineered (and no reason to bioengineer a random bat virus)." Holmes calls this idea "absolute nonsense." I'm guessing he won't be a fan of Nicholas Wade's article.
  • "Bat virus RaTG13 is not the direct ancestor of SARS-CoV-2—all the components of the virus exist in nature."
  • "No evidence of a secret SARS-CoV-2-like virus kept at the WIV (and no reason to keep it a secret before the pandemic)." The scientists at WIV say that they were not studying such a virus and Holmes says, "Frankly, I believe them." Nicholas Wade thinks they are lying but offers no proof and no reason to justify the lie.
  • The SARS-CoV-2 virus is probably not directly from bats and WIV was only studying bat viruses. Furthermore, the virus is probably not from Yunnan province, where the Wuhan Institute of Virology scientists collected many of their bat samples. (The institute itself is in Wuhan, in Hubei province.)
  • "SARS-CoV-2 was not perfectly adapted to humans on first emergence and appears to be a "generalist" virus." Nicholas Wade is wrong about this as well.
  • "Cases near WIV only appeared later in the outbreak." The first cases in Wuhan appear in the market, specifically in the area where live animals are sold. This strongly suggests that the virus came from animals in the market and that it originated in those animals somewhere else. There were cases in December 2019 that were not linked to the market but they were nowhere near the WIV.
  • "No evidence of SARS-CoV-2 infection at WIV—staff were PCR/antibody negative." Holmes says that if this is true then that rules out the lab leak hypothesis automatically. He says that either this is the biggest coverup in history and they're all lying or there's no evidence at all that the virus was ever in the lab. He concludes that the virus did not come from the lab but he's sure that the conspiracy theory is not going to go away anytime soon.

Holmes is right. The conspiracy theory is not going away because its proponents think that all Chinese are evil and can't be trusted. Those conspiracy believers are wrong. Please don't spread this ridiculous idea; it makes you no better than QAnon cultists.

If you're really interested in the facts then there are several articles on the origin of SARS-CoV-2 that you should read before falling for the lab leak conspiracy theory. Here's one.

MacLean, O.A., Lytras, S., Weaver, S., Singer, J.B., Boni, M.F., Lemey, P., Pond, S.L.K. and Robertson, D.L. (2021) Natural selection in the evolution of SARS-CoV-2 in bats created a generalist virus and highly capable human pathogen. PLoS Biology 19:e3001115. [doi: 10.1371/journal.pbio.3001115]

Virus host shifts are generally associated with novel adaptations to exploit the cells of the new host species optimally. Surprisingly, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has apparently required little to no significant adaptation to humans since the start of the Coronavirus Disease 2019 (COVID-19) pandemic and to October 2020. Here we assess the types of natural selection taking place in Sarbecoviruses in horseshoe bats versus the early SARS-CoV-2 evolution in humans. While there is moderate evidence of diversifying positive selection in SARS-CoV-2 in humans, it is limited to the early phase of the pandemic, and purifying selection is much weaker in SARS-CoV-2 than in related bat Sarbecoviruses. In contrast, our analysis detects evidence for significant positive episodic diversifying selection acting at the base of the bat virus lineage SARS-CoV-2 emerged from, accompanied by an adaptive depletion in CpG composition presumed to be linked to the action of antiviral mechanisms in these ancestral bat hosts. The closest bat virus to SARS-CoV-2, RmYN02 (sharing an ancestor about 1976), is a recombinant with a structure that includes differential CpG content in Spike; clear evidence of coinfection and evolution in bats without involvement of other species. While an undiscovered “facilitating” intermediate species cannot be discounted, collectively, our results support the progenitor of SARS-CoV-2 being capable of efficient human–human transmission as a consequence of its adaptive evolutionary history in bats, not humans, which created a relatively generalist virus.


Monday, May 03, 2021

More illusions/delusions of James Shapiro and Denis Noble

It was just a few weeks ago that I discussed short articles by Denis Noble and James Shapiro that were published in the journal Biosemiotics [The illusions of Denis Noble] [The illusions of James Shapiro].

Several readers questioned whether Biosemiotics is a real science journal and they were right: it's a kooky journal and that's why it publishes papers by kooks. However, we now have a new paper by Shapiro and Noble that's about to appear in a legitimate scientific journal; albeit, one that has seen better days. This would normally raise red flags concerning peer review but we're long past the time when we can count on peer review to weed out the kooks.

Here's the paper. I'm not going to discuss all the main points because they were covered in my previous posts. I'll just concentrate on the most ridiculous part in order to illustrate the (lack of) quality of this paper.1

Shapiro, J. and Noble, D. (2021) What prevents mainstream evolutionists teaching the whole truth about how genomes evolve? Progress in Biophysics and Molecular Biology. [doi: 10.1016/j.pbiomolbio.2021.04.004]

The common belief that the neo-Darwinian Modern Synthesis (MS) was buttressed by the discoveries of molecular biology is incorrect. On the contrary those discoveries have undermined the MS. This article discusses the many processes revealed by molecular studies and genome sequencing that contribute to evolution but nonetheless lie beyond the strict confines of the MS formulated in the 1940s. The core assumptions of the MS that molecular studies have discredited include the idea that DNA is intrinsically a faithful self-replicator, the one-way transfer of heritable information from nucleic acids to other cell molecules, the myth of “selfish DNA,” and the existence of an impenetrable Weismann Barrier separating somatic and germ line cells. Processes fundamental to modern evolutionary theory include symbiogenesis, biosphere interactions between distant taxa (including viruses), horizontal DNA transfers, natural genetic engineering, organismal stress responses that activate intrinsic genome change operators, and macroevolution by genome restructuring (distinct from the gradual accumulation of local microevolutionary changes in the MS). These 21st Century concepts treat the evolving genome as a highly formatted and integrated Read-Write (RW) database rather than a Read-Only Memory (ROM) collection of independent gene units that change by random copying errors. Most of the discoverers of these macroevolutionary processes have been ignored in mainstream textbooks and popularizations of evolutionary biology, as we document in some detail. Ironically, we show that the active view of evolution that emerges from genomics and molecular biology is much closer to the 19th century ideas of both Darwin and Lamarck. 
The capacity of cells to activate evolutionary genome change under stress can account for some of the most negative clinical results in oncology, especially the sudden appearance of treatment-resistant and more aggressive tumors following therapies intended to eradicate all cancer cells. Knowing that extreme stress can be a trigger for punctuated macroevolutionary change suggests that less lethal therapies may result in longer survival times.

The section on "selfish DNA" is the one that seems to have the highest number of misleading and false statements per paragraph.

1.4. The end of “selfish” or “junk” DNA

A major shortcoming of the MS is that it was based on a “gene-centric” view, which assumed that the genome is basically a collection of “genes” that are the protein-coding units of heredity and heritable variation. As we saw in the quotation from Goldschmidt's 1940 book, this view failed to take the evolutionary importance of chromosome structure into account (Goldschmidt, 1940). It also blinded evolutionary biologists to the importance of McClintock's mid-20th Century discovery of mobile “controlling elements” (McClintock, 1987). Both the ideas of genetic transposition and control of gene expression by these non-coding mobile elements did not fit within the narrow confines of the MS concepts of genome function and variation. A further empirical assault on the limited MS conceptual framework came in the late 1960s when Britten and Kohne discovered that a significant fraction of genomic DNA from complex eukaryotes consists of highly repetitive sequences rather than the unique coding sequences expected to make up the hereditary material (Britten and Kohne, 1968).

  • The title is ridiculous since no respectable scientist ever equated selfish DNA with junk DNA [Selfish genes and transposons].

  • The Modern Synthesis (MS) was not based on a "gene-centric" view.
  • For the past 50 years, no respectable scientist, and no knowledgeable expert in molecular evolution, has restricted the definition of "gene" to just protein-coding genes.
  • For the past 50 years, no expert in molecular evolution has ever thought that the genome is just a collection of protein-coding genes.
  • For the past 50 years, experts in molecular biology have known about transposons and have considered the view that some of them might be "controlling elements." They have concluded that most transposon-related sequences are just fragments of defective transposons with no biological function.
  • Nobody cares whether mobile genetic elements fit within the narrow confines of the Modern Synthesis as described by Huxley and others in the 1940s because no expert in molecular evolution has believed in that view of evolution since the late 1960s.
  • The Britten and Kohne paper established that the genomes of most multicellular eukaryotes contain large amounts of repetitive DNA. This was an attempt to resolve the C-value paradox. Britten and Kohne didn't like the idea that this could be junk DNA so they offered some speculation about function. However, further data established that most of this repetitive DNA is, indeed, junk and Britten and Kohne's speculations have been discredited. Britten and Kohne were attempting to interpret their result within the context of the adaptationist views that characterized the Modern Synthesis back then. The correct interpretation of their results came with the overthrow of the Modern Synthesis and the adoption of a new view of evolutionary theory that focused on Neutral Theory, Nearly-Neutral Theory, and the importance of random genetic drift. Shapiro and Noble missed that revolution so they continue to attack an old-fashioned strawman version of evolutionary theory.

Before continuing, it's important to realize that by the early 1970s selectionist thinking had been abandoned by the experts in genome evolution. By 1978 Gould and Lewontin tried, unsuccessfully, to convince all other biologists to abandon the old selectionist way of thinking [The Spandrels of San Marco and the Panglossian Paradigm]. James Shapiro and Denis Noble are among those other biologists who didn't get the message.

In order to apply selectionist thinking to explain the presence of so much non-coding DNA, evolutionary biologists called this unexpected portion of the genome “junk DNA” (Ohno, 1972) or “selfish DNA” (Orgel and Crick, 1980). Richard Dawkins used an extreme view of these “selfish genes” to erect a whole philosophy of strictly passive evolutionary gradualism (Dawkins, 1976). Today we know that the human genome contains at least 30X as much repetitive non-coding DNA as protein-coding sequences (Lander et al., 2001). Repetitive DNA provides formatting signals for transcription, epigenetic modification and chromosome mechanics and also is the most variable component in the evolutionary diversification of complex genomes (Symonová and Howell, 2018; Subirana et al., 2015; Matsubara et al., 2016; CioffiMde et al., 2015; Chalopin et al., 2015; Shao et al., 2019; Böhne et al., 2008; Li et al., 2016; Oliver et al., 2013). A 2013 plot of organismal complexity against protein-coding and non-coding DNA showed that coding DNA peaked at approximately ∼3 × 10⁷ bp, while the non-coding DNA increased linearly with growing complexity up to ∼2–3 × 10¹⁰ bp (Liu et al., 2013). In other words, non-coding DNA tracked organismal complexity better than the protein-coding genes. The “encyclopedia of DNA elements” (ENCODE) project, which largely abandoned the term “gene,” revealed that the large majority of the so-called junk DNA is actively transcribed in a regulated manner, indicating that it is functional (Consortium, 2012; Pennisi, 2012).

  • It is completely, totally ridiculous to say that the idea of junk DNA was due to selectionist thinking. The first statement in this paragraph is powerful evidence that Shapiro and Noble don't know what they are talking about. The concept of junk DNA is a rejection of selectionist thinking.
  • The use of "noncoding DNA" is what's called a "tell."
  • Again, equating junk DNA with selfish DNA is stupid. If all the excess DNA were selfish then it isn't junk because it has a function.
  • Richard Dawkins' view on evolution is closer to the old-fashioned adaptationist view that was abandoned by the experts by the time he wrote The Selfish Gene. Dawkins' book is not really about "genes," however, as is clear to anyone who has read it. He's talking about any piece of DNA that confers a fitness advantage. The Dawkins strawman is a favorite target of the Third Way types but it's just a strawman.
  • No significant proportion of repetitive DNA has a function in spite of the references quoted above.
  • There is no significant correlation between organismal complexity and noncoding DNA. Lots of very similar species, such as onions, have very different genome sizes.
  • No knowledgeable scientist since the 1980s thinks there should be a significant correlation between the number of genes and organismal complexity. We know that most of the phenotypic differences between multicellular species are due to changes in the timing and amount of expression of a standard set of genes. This is the main discovery of evolutionary-developmental biology (evo-devo), another revolution that Shapiro and Noble missed. They should educate themselves by reading Sean B. Carroll's books.
  • The ENCODE researchers did lots of silly things but they did NOT abandon the term "gene."
  • The idea that most of our genome is functional because of ENCODE is laughable in 2021. The fact that Shapiro and Noble would bring this up is another "tell" and the fact that they would reference Elizabeth Pennisi is even more revealing. These guys are incapable of thinking critically.
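
To make the genome size point concrete, here's a minimal sketch. The genome sizes are rough, rounded values that I'm supplying purely for illustration; they are not taken from the Shapiro and Noble paper or from any of the references they cite, and the function name is just a convenience for this example. The point is only that an onion carries roughly five times as much DNA as a human while a pufferfish gets by with a small fraction of it, which is hard to square with the idea that the amount of noncoding DNA tracks complexity.

```python
# Approximate haploid genome sizes in megabases (Mb).
# ASSUMPTION: rounded, illustrative values supplied for this sketch;
# they are not data from the paper under discussion.
GENOME_SIZE_MB = {
    "budding yeast": 12,
    "pufferfish (Takifugu)": 400,
    "human": 3100,
    "onion (Allium cepa)": 16000,
}


def fold_difference(species_a, species_b):
    """Return how many times larger species_a's genome is than species_b's."""
    return GENOME_SIZE_MB[species_a] / GENOME_SIZE_MB[species_b]


if __name__ == "__main__":
    # List the species from smallest to largest genome.
    for name, size in sorted(GENOME_SIZE_MB.items(), key=lambda kv: kv[1]):
        print(f"{name:25s} {size:>7,d} Mb")

    # Compare genome sizes directly.
    onion_vs_human = fold_difference("onion (Allium cepa)", "human")
    human_vs_fugu = fold_difference("human", "pufferfish (Takifugu)")
    print(f"onion / human:      {onion_vs_human:.1f}x")
    print(f"human / pufferfish: {human_vs_fugu:.1f}x")
```

If noncoding DNA really tracked organismal complexity, you'd have to argue that onions are several times more complex than we are.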

Shapiro and Noble then describe a few examples of repetitive DNA sequences that have a known function and they point out that a number of noncoding genes have been identified. They imply that these functional sequences make up a significant fraction of the genome, thus calling the concept of junk DNA into question. They close the section with,

Clearly, none of the eminent scientists who wrote about junk or selfish DNA could possibly have imagined the wide range of cellular functionalities that we know today are executed by ncRNA molecules. The idea that a genome was just a collection of protein coding sequences has proved completely inadequate.

  • I don't know about you, dear reader, but I'll match those "eminent scientists" against Shapiro and Noble any day. I'd love to see them try to defend their views in a public debate against some of the leading proponents of junk DNA. I know where my money would be.

Let me close by quoting the last chapter of this paper. I don't intend to comment on it except to say that it gives new meaning to the word "irony."

The campaign to sustain the Modern Synthesis causes real harm in a number of different ways. Among doctors treating bacterial infections, ignorance of real-world evolutionary processes has led to a situation in which the available antibiotics have lost their effectiveness against many life-threatening conditions (CDC et al., 2019). Among the general public, the inability to comprehend the potential all living organisms possess for transferring and reorganizing genomic configurations makes them unprepared to form sound judgements about how society should utilize its growing arsenal of biotechnology tools acquired from our microbial neighbors, like CRISPR (Doudna, 2020). Among oncologists, MS thinking prevents the practitioners treating cancer patients from recognizing the dangers of overtreating tolerable tumors in ways that may provoke a macroevolutionary transition to a far more lethal and untreatable disease (Heng, 2019). Finally, in the battle against obscurantism and anti-evolution prejudice, insistence on an outdated set of assertions about how life can change itself leaves the defenders of rigorous scientific inquiry without satisfactory responses to critics. Clearly, the time has come for the mainstream evolution community to recognize and join the scientific reality of the 21st Century.

Finally, one of the most important properties of kooks is that they find each other and they tend to hang out together, either physically or virtually. I'm not sure why this happens since they often espouse mutually exclusive views. I'm guessing that we can explain it in two different ways: (1) they are all outsiders fighting against a common enemy; namely, real science, and (2) they lack critical thinking skills so they don't see the flaws in each other's arguments.


1. In case you didn't recognize the quality from the title.

Thursday, April 29, 2021

Chromatin organization at promoters in yeast cells

Our genome is very large and very complicated because it is full of junk DNA. It contains thousands of sites where DNA binding proteins can bind just by chance. This leads to the reorganization of nucleosomes in a way that mimics functional sites. It's difficult to distinguish these spurious sites from real functional sites and that has led to much confusion in the scientific literature.1

The yeast genome is much simpler and it's safe to assume that almost all of the sites detected by the standard chromatin assays are genuine, biologically relevant sites. In that sense, it serves as a model for what functional sites look like. A recent paper in Nature (April 8, 2021) reports on the mapping of most of the sites in the yeast genome where DNA binding proteins are found.

Rossi, M.J., Kuntala, P.K., Lai, W.K., Yamada, N., Badjatia, N., Mittal, C., Kuzu, G., Bocklund, K., Farrell, N.P., Blanda, T.R.M., Joshua D, V, B.A., Mistretta, K.S., Rocco, D.J., Perkinson, E.S., Kellogg, G.D., Mahony, S. and Pugh, B.F. (2021) A high-resolution protein architecture of the budding yeast genome. Nature 592:309-314. [doi: 10.1038/s41586-021-03314-8]

Origins of replication

Origins of replication are also called autonomously replicating sequence consensus sequences (ACS). There are 253 of them in the yeast genome and they are characterized by a 300 bp nucleosome-free region that's occupied by the origin recognition complex (ORC) and the helicase MCM.

Telomeres

Telomeres are bound by a number of proteins including silent information regulators (SIRs). There's a nucleosome-free region of about 300 bp where these proteins are located.

Centromeres

The nucleosome-free region at centromeres covers only 170 bp where a number of centromere binding proteins are located. The absence of nucleosomes at the centromere is a surprise since it was thought that centromere DNA was bound by modified nucleosomes containing a specific histone variant.