The novel coronavirus, SARS-CoV-2, has infected at least 3 million people worldwide and 1 million people in the United States alone. The debate surrounding the origin and source of the virus has heated up with many accusing the opposite side of rejecting scientific evidence. It is more important now than ever to understand the difference between scientific skepticism and a conspiracy theory.
The term conspiracy theory is often used to suggest that an explanation is an implausible hypothesis or is anti-scientific. Yet the existence of bad actors or a cover-up in a hypothesis isn’t enough to constitute a conspiracy theory. Take the Iranian nuclear program as an example. Is it a conspiracy theory to consider that the Iranian pursuit of nuclear energy or uranium enrichment is motivated by nuclear weapon capability ambitions?
It’s a fact that there are bad actors in the world—and that individuals and states alike often lie about their actions and aims in order to advance what they understand to be their own self-interest. Take India’s so-called peaceful nuclear explosion in 1974 as an example. India took advantage of the Atoms for Peace program and used the information provided by the United States and Canada as well as the CIRUS research reactor to develop its nuclear weapons program. Even more interestingly, they declared that their nuclear weapons were intended for civil applications such as large-scale excavation and not intended for military use. It is a fact that India now possesses a large arsenal of nuclear weapons.
It’s no secret to anyone—and therefore not a conspiracy theory—that communism and other forms of totalitarian rule are built on a culture of secrecy. Communism necessitates a strong central government, and for a central government to maintain strong control over a country, it’s necessary for them to control information flow into, within, and out of the country. This involves both direct and indirect censorship of the media and internet—and often, more importantly, tight administrative controls that govern the transfer of information within the country. A good example from recent history is the Chernobyl nuclear disaster of the Soviet Union. Delays in reporting the initial nuclear explosion caused many fatalities that could have been avoided had the Soviets acted and evacuated early, which in turn motivated a Soviet attempt to cover up the true extent of the disaster to maintain a strong government image.
Cover-ups don’t generally involve evil conspirators who try to hide some important truth with the primary aim of injuring large numbers of people. They often follow naturally from the structure and functioning of state bureaucracies—or they can be rational if arguably selfish means of pursuing what a group of people understands to be matters of national self-interest, like ensuring adequate supplies of medicine and protective gear for one’s own citizens. Therefore, it’s not a conspiracy theory to consider the possibility of a cover-up relating to the origin and the source of COVID-19 in Wuhan, Hubei, China. In fact, there is good evidence of Chinese cover-ups from the beginning of the pandemic.
There are certain elements that are usually present in conspiracy theories that are not present in a sound scientific hypothesis. Conspiracy theories often posit a cause that has no plausible physical connection to its claimed effect. A good example of this is the 5G conspiracy theory in its different forms. Basic elementary school science education is enough to refute such a “theory,” which it is painful to even call a theory. A conspiracy theory may also be a bad and malicious hypothesis that is promoted despite available, reliable data directly proving that it can’t be true. One example of this is the false claim that SARS-CoV-2 was engineered to selectively infect non-Asians. Simple inspection of infection demographics in the United States refutes this hypothesis. While there are some differences in human ACE2 receptors among different races, the differences are not strong enough to provide immunity to a particular race or group, especially as coronaviruses rapidly adapt and evolve into new strains.
It is important to clarify, however, that not every false hypothesis is a conspiracy theory. For instance, some researchers pointed out the presence of HIV-like segments in the SARS-CoV-2 genome and claimed, based on an incomplete investigation, that it is evidence of intentional manipulation. The presence of HIV-like segments is an observation that is clearly explained by natural acquisition of those segments in a manner similar to that in related naturally occurring bat coronaviruses such as ZC45 and ZXC21, which contain similar segments.
Conversely, there are elements that are present in a sound scientific hypothesis that are not present in a conspiracy theory. One such element is justification. Scientists can’t investigate every idea or hypothesis. Justifying a hypothesis is one of the most tedious steps in research. This process involves gathering evidence, demonstrating that the hypothesis is plausible, and clearly explaining the need for the work in the context of the ongoing scientific conversation. Assessing the plausibility of a particular hypothesis is important to justify investigating it. This, however, must be done in context of the effect in question. A stronger effect would justify investigating even less plausible hypotheses. On the other hand, justifying the need for the work can be as simple as explaining gaps in the knowledge or finding discrepancies and loopholes in published work that are significant enough to affect the conclusions.
The hypothesis that SARS-CoV-2 leaked out of a laboratory is, by scientific standards, a sound and well-justified hypothesis. Media sources that claim to refute the lab source hypothesis often refer to the public comments of zoologist Peter Daszak, the flawed correspondence of Andersen et al., or the emotional Lancet letter in which some scientists basically expressed their support for and compassion toward their Chinese peers. While there are some virus hunters like Peter Daszak who assert zoonotic transfer and discount the possibility of a lab leak, there are also leading microbiologists like professor Richard Ebright who assert that a lab or lab-related accident is a possible cause of the outbreak.
Notably, virus ecologists like Peter Daszak and Jonna Mazet have an inherent conflict of interest as they are involved in similar bat and wildlife sampling activity—and, in Daszak and Mazet’s case, in research with the Wuhan labs. As an example of such activity, Daszak and collaborators sampled 12,333 bats for viruses in a big wildlife surveillance project. A lab-related accident in China involving similar research would likely affect the funding for their work as it would demonstrate the risks involved. As it happens, the NIH recently cut the funding to Daszak’s EcoHealth Alliance after realizing the risks involved in that research.
Daszak’s relentless and heavily amplified public assertions that the outbreak must have originated due to a zoonotic jump, and his denial of the possibility of a lab accident involving a natural virus, even long before the SARS-CoV-2 genome was published, would appear to be motivated by the apparent conflict of interest that he has denied. Daszak’s denial of his conflict of interest raised concerns among many scientists and experts, with many explicitly describing that denial as a bold lie. Daszak has presented no direct evidence that the outbreak started as a result of a zoonotic jump outside of a laboratory. If the outbreak turns out to be the result of a natural zoonotic jump, that would underscore the importance of Daszak’s risky wildlife sampling and “early outbreak warning” work and increase his research funding. It is important to consider conflicts of interest when assessing anyone’s claims.
Daszak’s main argument is that the majority of viruses evolve in nature and some may be transmitted to humans through natural animal contact that is frequent in Southeast Asia. This argument, however, is meaningless unless we are trying to place bets blindly without looking at any other factors. Daszak’s argument would be a very poor and mathematically flawed reason to call off investigations into the origin and source of the virus. Facts at the population level don’t make SARS-CoV-2 in particular any likelier to be natural in its origin or transmission source.
To illustrate this with a simple mathematical example, suppose that we know from established statistics that an overwhelming 80% of the people in a particular small town are doctors. You enter a fish market in that town and see someone selling fish. Is it reasonable to say that there is an 80% probability that he is also a doctor? While there is still a small chance that this person is also a doctor, we would need to look at what share of the town’s fishmongers are also doctors if we wish to place bets. If we wish to find out for certain, we could follow him, research his background, and see if he is a doctor.
Population statistics are useful at the population level but far less so at the individual level, where the information can be obtained by direct measurement. At the individual level, population statistics translate into a probability only if we blindly pick a random individual. If the individual isn’t really random, i.e., if we know some other information about them, the statistics we have on the population as a whole break down and become meaningless.
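The doctor and fishmonger example can be made concrete with a small calculation. The sketch below is only an illustration: the town size and every head count are hypothetical numbers chosen for the example, and only the 80% share of doctors comes from the text. It compares the population-level probability of being a doctor with the probability of being a doctor given that the person is selling fish.

```python
from fractions import Fraction

# Hypothetical head counts for a town of 1,000 people.
# Only the 80% share of doctors comes from the article; the rest is assumed.
town = {
    "doctor_only": 799,
    "doctor_and_fishmonger": 1,
    "fishmonger_only": 19,
    "neither": 181,
}

population = sum(town.values())

# Population-level probability: pick a resident blindly at random.
doctors = town["doctor_only"] + town["doctor_and_fishmonger"]
p_doctor = Fraction(doctors, population)

# Conditional probability: we already know the person sells fish.
fishmongers = town["doctor_and_fishmonger"] + town["fishmonger_only"]
p_doctor_given_fishmonger = Fraction(town["doctor_and_fishmonger"], fishmongers)

print(f"P(doctor) = {float(p_doctor):.2f}")
print(f"P(doctor | fishmonger) = {float(p_doctor_given_fishmonger):.2f}")
```

Under these assumed counts, the blind-pick probability is 0.80, while the probability for someone known to be selling fish is 0.05. The extra piece of information about the individual, not the population statistic, drives the estimate.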
Given that the 96.2% sequence match of bat RaTG13 and human SARS-CoV-2 is not enough to rule out even a chimeric origin, Andersen et al. analyzed the mutations in the receptor-binding domain (RBD) of SARS-CoV-2 and compared features of its spike protein with those of bat RaTG13, pangolin coronavirus, human SARS-CoV, and two bat SARS-like coronaviruses. They highlighted two notable features in SARS-CoV-2: the optimized binding of the spike protein of SARS-CoV-2 to the human ACE2 receptor, and the existence of a functional polybasic site, of nonobvious function, at the boundary between the two subunits of the spike, which is likely a result of natural mutations. Their analysis of the mutations showed, under many unverified assumptions, that the so-called RaTG13 couldn’t have been the backbone of SARS-CoV-2 had it been chimeric.
However, after their brief and informative scientific endeavor, the authors then presented flawed arguments on the nature and source of the virus and conclusions that only reflect their beliefs and opinion. The approach they used to reach their conclusions is not sound for verification purposes, as it relies fundamentally on faith and trust. While trust is usual and healthy in academia, it’s not suitable for verification of lab accidents involving large-scale damage or potential WMD/dual use activity backed by a state.
First, Andersen et al. don’t conduct independent sequencing of bat RaTG13 samples which were sampled in 2013 but only sequenced and uploaded to GenBank in 2020. Therefore, Andersen’s analysis is just an extension of the published work of Zhou et al. from the Wuhan Institute of Virology, which is one alleged source of a possible leak of the virus. Second, they assume that published information from a lab where a source is suspected is complete, and they don’t verify that bat RaTG13 is, indeed, the closest relative of human SARS-CoV-2 encountered by or known to the two labs where the origin or source is suspected.
The conclusions of Andersen et al. on the nature of the virus almost all hinge on the assumption that they know all backbone viruses studied at the Wuhan lab, which reflects circular reasoning, given their sources and assumptions. The closest known virus to human SARS-CoV-2 and bat RaTG13 is bat BtCoV/4991—but only a partial sequence for the RdRp gene of BtCoV/4991 was uploaded to GenBank in 2016. It’s unclear if BtCoV/4991 is RaTG13 itself or a closer progenitor of SARS-CoV-2, because only a partial sequence was uploaded and BtCoV/4991 wasn’t referenced by Zhou et al. It’s unclear why it would be renamed.
Third, as professor Richard Ebright has pointed out, the authors dismiss the possibility that bat RaTG13 is a proximate progenitor of SARS-CoV-2 based on unverified assumptions about evolutionary rates and about the possibility of passage in cell culture or animal models. While Andersen et al. do briefly acknowledge the possibility of passage in cell culture, they go on to conclude, on the basis of assumption alone, that the virus is natural in both origin and source, when in fact a closely related bat coronavirus could have adapted to human cells in cell culture experiments.
Fourth, Andersen argued that discrepancies between the computational analysis in one study they cited and experimental results are “strong evidence” of the absence of any purposeful manipulation of the virus. This argument should be dismissed as a reductionist fallacy, as it underestimates the degrees of freedom and the types of computational analyses available. Other scientists using molecular dynamics simulations showed that SARS-CoV-2 had a much higher binding affinity to human ACE2 receptors than SARS-CoV, with predictions in agreement with experiments.
The fact that Andersen’s discussion is flawed doesn’t say anything about the nature or the source of the virus. It, however, shows that their work can’t be considered conclusive and justifies further study on the origin and source of the virus.
There are many other reasons that justify investigating the Wuhan labs, and possibly even other labs in China that work with the same viruses. In particular, (a) the emergence of SARS-CoV-2 in a highly populated city in central China like Wuhan and close to the Wuhan CDC; (b) the existence of two labs in Wuhan that extensively sample bats and study coronaviruses; (c) the relatively close relationship between the SARS-CoV-2 virus and bat RaTG13 or BtCoV/4991 that the researchers obtained from bats in a cave that is 1,200 miles away from Wuhan, which suggests that SARS-CoV-2 progenitors came from the same Yunnan caves; (d) the widespread use of cell culture experiments in infectious disease transmission experiments that can allow closely related viruses to adapt to human receptors; (e) the use of chimeric coronaviruses in civil research with different backbones (the lack of knowledge of the pre-outbreak collections of the Wuhan labs justifies international inspections, and the diversity of bat ACE2 receptors can also obscure the origin of the virus as the spike proteins of natural bat coronaviruses are very diverse); (f) evidence of lax security and knowledge that lab accidents aren’t improbable; (g) evidence that not all sampled viruses are sequenced and published: the full BtCoV/4991 sequence hasn’t been published and remains a mystery despite ~99% similarity of the known portion to SARS-CoV-2, while RaTG13 was sampled in 2013 and its sequence published only in 2020 (the large similarity of the small partial sequence of BtCoV/4991, published in 2016, with SARS-CoV-2 is evidently what motivated the WIV to release the sequence of RaTG13, which matches the known portion of BtCoV/4991; it has not been independently verified that the sequence uploaded for bat RaTG13 is accurate); (h) the available data doesn’t suggest that closely related SARS-CoV-2-like bat relatives are common among bats in China, but rather that they are unique to bats from a particular Yunnan area; (i) the available data doesn’t support the wet market hypothesis, which prompted some lab accident deniers to propose the alternative farm source hypothesis.
The farm hypothesis is highly improbable as the bats that carry SARS-CoV-2-like coronaviruses are 1,200 miles away from Wuhan. It would have been a more probable cause had the outbreak started in the Yunnan province. Further, there is no circumstantial evidence to support the farm hypothesis or even suggest it; it’s pure speculation. A notable fact is that most bat species near Wuhan hibernate in December, as pointed out by Lu et al. If the farm hypothesis were true, multiple spillovers in different cities would have taken place, which is not suggested by the data, unless transmission within the intermediate species is improbable, which would have made it much less likely for the outbreak to start in Wuhan in the first place. Before the farm hypothesis, there was the pangolin hypothesis, which was rejected by experts because pangolins are critically endangered in many areas and it’s improbable that they acted as an intermediary, at least outside a lab.
The genome sequences of human SARS-CoV-2 in just nine early patients exhibited 1%-2% difference among the subjects. Samples of bat RaTG13, 96.2% similar to SARS-CoV-2, should be obtained, sequenced, and studied in cell culture as part of scientific verification efforts.
Scientific skepticism is not the same as propagating conspiracy theories. It’s important to acknowledge that it was Chinese scientists who first brought up the possibility of an accidental leak in a short letter. As has been pointed out by U.S. Sen. Tom Cotton, the available circumstantial evidence indeed suggests a lab leak, with the simplest scenario being the leak of a bat coronavirus closely related to SARS-CoV-2 from cell culture or animal model experiments after adapting to human/humanlike receptors. Investigators must carefully consider conflicts of interest of researchers, especially those who relentlessly promote Chinese government types of propaganda to protect personal interests that they don’t clearly acknowledge and their collaborations inside China. Researchers should also not be credulous and should follow systematic step-by-step approaches to avoid falling into traps of circular reasoning and repeating propaganda messaging that is controlled and spread by centralized governments.
In closing, it’s important to emphasize that science needs more evidence-based, objective research with technical rather than broad conclusions. Speculations are good for forming hypotheses but should never be presented as conclusions. The Andersen-type speculative conclusions are of questionable scientific value and make no useful contribution to available knowledge about the coronavirus pandemic. Emotions such as peer sympathy, anger, fear, personal self-interest, and partisan political attachments should all be put aside when investigating matters with broad consequences for global security and human health. While speculative conclusions of any kind may turn out to be true, science doesn’t give credit to speculations. Scientists shouldn’t play dice in their analysis and discussion.
Khaled Talaat is a postdoctoral scholar in nuclear engineering at the University of New Mexico. He has conducted research on multiple subjects including aerosols, radiological protection, and Generation IV lead-cooled fast reactors.