Lots of people seem to be confused about the origin of SARS-CoV-2, the virus responsible for the COVID-19 pandemic. The investigating committee of the World Health Organization (WHO) concluded last winter that a natural origin is the most likely scenario, but there still seems to be a substantial percentage of the population who think that the virus was being studied at the Wuhan Institute of Virology (WIV) and leaked from there to start the pandemic. This belief in a lab leak scenario persists in spite of the fact that 21 expert scientists have discounted it and concluded that a natural origin is the best hypothesis. The lab leak speculation persists even though the United States intelligence agencies reached the same conclusion as the scientific experts and said that a natural origin was more likely.
I've published several posts on this topic over the past year trying to emphasize four points: (1) the evidence strongly favors a natural origin, (2) there is no scientific evidence to support the claim that the WIV scientists were working on SARS-CoV-2 before the pandemic started, (3) the most knowledgeable science experts agree that a natural origin is the most likely scenario, and (4) the media is misrepresenting the science and treating the two competing explanations as equivalent.
In this post I want to describe the case for a natural origin in as simple a manner as possible so that people can refer to the main lines of evidence and so that opponents of a natural origin can explain why they dismiss that evidence. I also want to briefly explain why we need to listen to the experts instead of arbitrarily dismissing their views as an argument from authority and assuming that our own research trumps the experts.
1. Other infectious coronaviruses have a natural origin
There are six previously known coronaviruses that infect humans and all of them have a natural origin in either bats or mice. Five of them were most likely transmitted to humans through another animal such as camels, pigs, or cows. This does not prove that SARS-CoV-2 had a natural origin but it does lend strong support to the idea and it makes us skeptical of claims that there's something special about SARS-CoV-2 that requires a different explanation.
2. Early cases are linked to the Wuhan wet markets
Most of the first infections in Wuhan were in people who frequented the wet markets, suggesting that, at the very least, the market areas were the location of a superspreader event. There is no other location that seems to have been the source of a significant number of early infections. Of the three earliest cases, two were directly linked to the Huanan wet market. SARS-CoV-2 was detected in samples from the Huanan wet market itself, although none of the animals that have been tested were positive for SARS-CoV-2.
There were two different SARS-CoV-2 lineages (A and B) in the early cases, suggesting two independent events. Lineage B was found in the Huanan market and lineage A is associated with another market and with cases in other parts of China. Lineage B quickly became the dominant lineage worldwide. The data is consistent with a natural origin and spread from animals that were sold in these markets.
3. Sequence data
The first sequence of SARS-CoV-2 was published in early January 2020 and hundreds of other sequences, including all the main variants, have been published since then. Dozens of sequences of related bat viruses have been published including some of the closest relatives that have only been discovered in the past year such as BANAL-52 from bat caves in Laos. The standard reference sequence is 29,903 bp long and it differs from the related bat virus sequences at more than 900 sites.
By looking at the relationship between the full sequences, it is possible to construct phylogenetic trees showing the probable evolution of SARS-CoV-2. The data shows that SARS-CoV-2 clusters with a small group of bat viruses called the betacoronaviruses. This group includes a group of virus sequences recently identified in pangolins and it's quite distinct from the group that includes the original SARS virus that caused the 2002 outbreak.
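For readers who want a feel for how sequences are compared before a tree is built, here is a minimal sketch of the first step in many distance-based methods: computing the fraction of differing sites between every pair of aligned sequences and finding the closest pair. The three short sequences and the names `virus_A`, `virus_B`, and `virus_C` are invented for illustration and are not real viral data. Real analyses use full genome alignments and statistical models of substitution rather than raw mismatch counts.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned sites at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def closest_pair(aligned: Dict[str, str]) -> Tuple[str, str, float]:
    """Return the two sequence names with the smallest pairwise distance."""
    pairs = combinations(sorted(aligned), 2)
    name_a, name_b = min(
        pairs, key=lambda pair: p_distance(aligned[pair[0]], aligned[pair[1]])
    )
    return name_a, name_b, p_distance(aligned[name_a], aligned[name_b])


# Hypothetical toy alignment (20 sites each); NOT real virus sequences.
toy_alignment = {
    "virus_A": "ACGTACGTACGTACGTACGT",
    "virus_B": "ACGTACGTACGTACGTACGA",
    "virus_C": "ACGTTCGTACGAACGTACGA",
}

if __name__ == "__main__":
    print(closest_pair(toy_alignment))
```

In this toy case `virus_A` and `virus_B` differ at one site out of 20, so they would be joined first in a simple clustering scheme, with `virus_C` branching off earlier.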
In terms of the overall backbone sequence, one of the viruses, BANAL-52, isolated just a few months ago, is the closest match to SARS-CoV-2. It is much more closely related than RaTG13, the bat virus isolated from caves in China and sequenced by scientists at WIV. As a recent NIH report notes,
Although RaTG13 and BANAL-52 are 96-97% identical to SARS-CoV-2 at the nucleotide level (>900 nucleotide differences across the entire genome), the difference actually represents decades of evolutionary divergence from SARS-CoV-2. Experts in evolutionary biology and virology have made it clear that even the closest known relatives of SARS-CoV-2 ... are evolutionarily too distant from SARS-CoV-2 to have been the progenitor of the COVID-19 pandemic (ref, ref). Field studies continue the search for more proximate progenitors.
The data support a natural origin of SARS-CoV-2 from an unknown precursor that may have arisen as the result of a recombination event between two bat viruses. This pattern of evolution is quite common among bat coronaviruses and there's nothing in the sequence data to suggest an unnatural origin.
This is an important point. The sequence of SARS-CoV-2 has all the characteristics of a naturally evolving virus that has shared the same evolutionary history as its closest bat relatives. The data is completely consistent with evolution in a bat host for the past several decades (Deng et al., 2021). This is a pattern based on over 900 mutations that are not present in any of the other bat virus genomes that have been sequenced. Anyone promoting a non-natural origin has to account for this data.
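The numbers quoted above fit together with some simple arithmetic. The sketch below converts between a count of nucleotide differences and percent identity, using the 29,903 length of the reference sequence mentioned earlier. The function names are my own, and the calculation assumes every difference is a single-site substitution over the full genome length. Real alignments also contain insertions and deletions, so treat the results as rough.

```python
GENOME_LENGTH = 29_903  # length of the standard reference sequence cited in the text


def percent_identity(differences: int, length: int = GENOME_LENGTH) -> float:
    """Percent of sites that match, given a count of differing sites."""
    return 100.0 * (length - differences) / length


def differences_for_identity(identity_percent: float, length: int = GENOME_LENGTH) -> int:
    """Approximate number of differing sites implied by a percent identity."""
    return round(length * (100.0 - identity_percent) / 100.0)


if __name__ == "__main__":
    # 900 differences is just under 97% identity.
    print(f"{percent_identity(900):.2f}% identical with 900 differences")
    # The 96-97% range corresponds to roughly 900 to 1,200 differences.
    print(differences_for_identity(97.0), "differences at 97% identity")
    print(differences_for_identity(96.0), "differences at 96% identity")
```

So "96-97% identical" and "more than 900 differences" are two ways of stating the same gap between SARS-CoV-2 and its closest known bat relatives.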
Here are some links to the life cycle of coronaviruses and the SARS-CoV-2 sequence for those who want more information.
- The coronavirus life cycle
- Structure and expression of the SARS-CoV-2 (coronavirus) genome
- The SARS-CoV-2 reference genome
The argument from authority
Very few of us have direct experience in coronavirus evolution. There are hundreds of publications on this topic and many of the studies are well above our pay grade. We have no choice but to trust the word of others with more experience.
Here's the problem. Some scientific publications promote a natural origin of SARS-CoV-2 while others attempt to cast doubt on that scenario. On the surface, it looks like the experts disagree so there's no obvious consensus but appearances can be deceiving. There are far more experts supporting a natural origin but the minority promoting a lab leak hypothesis are getting a lot more press. This leads to a very misleading impression that's often echoed on TV by people like medical doctors who don't have a good grasp of how science works.
The fact that scientists disagree is not unusual.
I used to teach a course on critical thinking and one of the most important lessons was on how to tell which scientists (or science journalists) are likely to be correct. There are several important clues you can use when reading scientific papers or media reports.
- If the scientists are repeating statements that have already been refuted then you are right to question their credibility. It suggests that they either have an unscientific agenda or they aren't on top of their subject. For example, if a scientist says that SARS-CoV-2 was highly adapted to humans when it first appeared then you should be skeptical because, in fact, SARS-CoV-2 is a generalist virus that can propagate in a wide variety of mammalian species. The repetition of previously refuted claims is a key indicator of low credibility.
This also applies to unproven claims that are repeated without mentioning that the claim is disputed. A good example in this debate is the claim that several WIV lab workers were hospitalized with pneumonia-like symptoms in November 2019. So far there is no evidence to back up this claim so a good science writer would mention that.
- If a claim is only referred to indirectly without fleshing out the actual scenario then you should be cautious. For example, if a scientist claims that SARS-CoV-2 was created in a lab but doesn't spell out how this might have been done other than hand-waving statements about furin cleavage sites then you have to ask yourself why they don't produce a specific scenario that can be examined. It usually means that the writer has not thought through their claim and has not considered the implications. Those are bad signs.
- Another bad sign is when a claim is associated with discrediting other scientists rather than just disagreeing with them. There's nothing about the natural origin explanation that requires you to believe that any scientists are lying or trying to misrepresent the evidence. However, in order to believe the lab leak explanation you have to also believe that the scientists at WIV are lying about never working with SARS-CoV-2 before the pandemic and that they are now conspiring to cover up their involvement with starting the pandemic. That's possible but, as a general rule, whenever a claim has to invoke a conspiracy in order to be believed then you should see red flags.
- A favorite trick of the anti-science crowd is to promote irrelevant information dressed up as though it were meaningful. The obvious example in this controversy is the fact that the WIV took down their database of preliminary viral sequence data in September 2019. The data is still available in various caches and most of it was published. We know that there was nothing in that database that pointed to anything but a natural origin of SARS-CoV-2 but that doesn't stop the anti-science crowd from trying to make it look suspicious.
Sanjay Gupta of CNN, a medical doctor rather than a scientist, gave us a good example of this in his recent special on the origin of SARS-CoV-2 when he repeatedly brought up this topic without explaining why it was relevant, other than to raise suspicions among the uninformed.
Another example of this type of red-herring tactic is the attack on the funding of EcoHealth Alliance by the NIH. The anti-science writers and politicians with an agenda are trying to make it sound as though NIH was funding gain-of-function research at the Wuhan Institute of Virology. It wasn't. The NIH was funding a perfectly normal investigation into the evolution of coronaviruses in China in an effort to predict, and possibly prevent, a pandemic. The anti-science crowd cleverly raises doubt about the motives of NIH as part of its conspiracy theory and it cleverly doesn't mention the lack of any evidence connecting what they are suggesting to the origin of SARS-CoV-2. That's another bad sign that critical thinkers will take note of when evaluating credibility.
- One of the key indicators of reliable authorities is when they present both sides of an argument. For example, if you read a paper that discusses the evidence for a natural origin of SARS-CoV-2 and then goes on to explain the lab leak scenario and why it is less credible then that's a good sign. On the other hand, if a paper only discusses its authors' own biased opinion without mentioning and refuting the contrary evidence then that's a bad sign. I don't know of a single article that promotes the lab leak scenario and also attempts to refute the evidence of a natural origin. I find that troubling because it's typical of many other disputes where that kind of behavior is associated with the losing side. This is one of the problems that Richard Feynman identifies with "cargo-cult science."
Details that could throw doubt on your interpretation must be given, if you know them. You must do the best you can — if you know anything at all wrong, or possibly wrong — to explain it. If you make a theory, for example, and advertise it, or put it out, then you must also put down all the facts that disagree with it, as well as those that agree with it.
Richard Feynman (1985)
The lab leak conspiracy theory has all of the characteristics of other anti-science attacks such as the creationists' attack on evolution and those who deny climate change. We should stop giving them publicity and we should not hesitate to point out their lack of knowledge.
- Most scientists dismiss the lab leak conspiracy theory
- 21 experts support a natural origin for SARS-CoV-2
- Is the media finally realizing that they have been duped into promoting the lab leak conspiracy theory?
- Let's analyze the Newsweek lab leak conspiracy theory article
- Real scientists discuss the lab leak conspiracy theory
- World Health Organization (WHO) report on the natural origin theory of SARS-CoV-2
- World Health Organization (WHO) report on the lab leak conspiracy theory
- Lab leak conspiracy theory rears its ugly head again: this time it's Nicholas Wade of the New York Times
Image Credits: The coronavirus figure is from Alexey Solodovnikov and Wikimedia Commons. The figure illustrating the origins of coronavirus infections is from a review in Nature Reviews Microbiology by Zheng-Li Shi of the Wuhan Institute of Virology (Cui et al., 2019). The review was first published in December 2018 and it's where she warns of future pandemics caused by bat coronaviruses. The figure showing the location of the earliest infections in Wuhan is from Holmes et al. (2021). The phylogenetic tree is from Zhou et al. (2021).
Cui, J., Li, F. and Shi, Z.-L. (2019) Origin and evolution of pathogenic coronaviruses. Nature Reviews Microbiology 17:181-192. [doi: 10.1038/s41579-018-0118-9]
Deng, S., Xing, K. and He, X. (2021) Mutation signatures inform the natural host of SARS-CoV-2. bioRxiv. [doi: 10.1101/2021.07.05.451089]
Holmes, E.C., Goldstein, S.A., Rasmussen, A.L., Robertson, D.L., Crits-Christoph, A., Wertheim, J.O., Anthony, S.J., Barclay, W.S., Boni, M.F., Doherty, P.C., Farrar, J., Geoghegan, J.L., Jiang, X., Leibowitz, J.L., Neil, S.J.D., Skern, T., Weiss, S.R., Worobey, M., Andersen, K.G., Garry, R.F. and Rambaut, A. (2021) The origins of SARS-CoV-2: A critical review. Cell 184:4848-4856. [doi: 10.1016/j.cell.2021.08.017]
Zhou, H., Ji, J., Chen, X., Bi, Y., Li, J., Wang, Q., Hu, T., Song, H., Zhao, R., Chen, Y., Cui, M., Zhang, Y., Hughes, A.C., Holmes, E.C. and Shi, W. (2021) Identification of novel bat coronaviruses sheds light on the evolutionary origins of SARS-CoV-2 and related viruses. Cell 184:4380-4391. [doi: 10.1101/2021.03.08.434390]