We compiled a set of 69 SARS-CoV genomes including 58 sampled from humans and 11 sampled from civets and raccoon dogs. 5 Comparisons of GC content across taxa. BEAST inferences made use of the BEAGLE v.3 library (ref. 68) for efficient likelihood computations. A third approach attempted to minimize the number of regions removed while also minimizing signals of mosaicism and homoplasy. Bruen, T. C., Philippe, H. & Bryant, D. A simple and robust statistical test for detecting the presence of recombination. The assumption of long-term purifying selection would imply that coronaviruses are in endemic equilibrium with their natural host species, horseshoe bats, to which they are presumably well adapted. The rate of genome generation is unprecedented, yet there is currently no coherent nor accepted scheme for naming the expanding . Pangolin was developed to implement the dynamic nomenclature of SARS-CoV-2 lineages, known as the Pango nomenclature. Green boxplots show the TMRCA estimate for the RaTG13/SARS-CoV-2 lineage and its most closely related pangolin lineage (Guangdong 2019). To begin characterizing any ancestral relationships for SARS-CoV-2, NRRs of the genome must be identified so that reliable phylogenetic reconstruction and dating can be performed. However, formal testing using marginal likelihood estimation (ref. 41) does provide some evidence of a temporal signal, albeit with limited log Bayes factor support of 3 (NRR1), 10 (NRR2) and 3 (NRA3); see Supplementary Table 1. Virological.org http://virological.org/t/ncov-2019-codon-usage-and-reservoir-not-snakes-v2/339 (2020).
To employ phylogenetic dating methods, recombinant regions of a 68-genome sarbecovirus alignment were removed with three independent methods. 725422-ReservoirDOCS). In March, when covid cases began spiking around India, Bani Jolly went hunting for answers in the virus's genetic code. SARS-CoV-2 and RaTG13 are the most closely related (their most recent common ancestor nodes denoted by green circles), except in the 222-nt variable-loop region of the C-terminal domain (bar graphs at bottom). Sliding window analysis of changes in the patterns of sequence similarity between human SARS-CoV-2, and pangolin and bat coronaviruses as described further in Fig. We infer time-measured evolutionary histories using a Bayesian phylogenetic approach while incorporating rate priors based on mean MERS-CoV and HCoV-OC43 rates and with standard deviations that allow for more uncertainty than the empirical estimates for both viruses (see Methods). Background & objectives: Several phylogenetic classification systems have been devised to trace the viral lineages of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It allows a user to assign the most likely lineage (Pango lineage) to SARS-CoV-2 query sequences. wrote the first draft of the manuscript, and all authors contributed to manuscript editing. Of the countries that have contributed SARS-CoV-2 data, 30% had genomes of this lineage. All sequence data analysed in this manuscript are available at https://github.com/plemey/SARSCoV2origins. On first examination this would suggest that SARS-CoV-2 is a recombinant of an ancestor of Pangolin-2019 and RaTG13, as proposed by others (refs. 11,22). If the latter still identified non-negligible recombination signal, we removed additional genomes that were identified as major contributors to the remaining signal.
Identification of diverse alphacoronaviruses and genomic characterization of a novel severe acute respiratory syndrome-like coronavirus from bats in China. In December 2019, a cluster of pneumonia cases epidemiologically linked to an open-air live animal market in the city of Wuhan (Hubei Province), China (refs. 1,2) led local health officials to issue an epidemiological alert to the Chinese Center for Disease Control and Prevention and the World Health Organization's (WHO) China Country Office. PI signals were identified (with bootstrap support >80%) for seven of these eight breakpoints: positions 1,684, 3,046, 9,237, 11,885, 21,753, 22,773 and 24,628. However, for several reasons, nucleotide sequences may be generated that cover only the spike gene of SARS-CoV-2. (2020) with additional (and higher quality) snake coding sequence data and several miscellaneous eukaryotes with low genomic GC content failed to find any meaningful clustering of the SARS-CoV-2 with snake genomes (a). performed codon usage analysis. Now, the two researchers used genomic sequencing to compare the genetic sequence of the new coronavirus in humans with that in animals and found a 99% match with pangolins. Anderson, K. G. nCoV-2019 codon usage and reservoir (not snakes v2). Did Pangolin Trafficking Cause the Coronavirus Pandemic? Genetic lineages of SARS-CoV-2 have been emerging and circulating around the world since the beginning of the COVID-19 pandemic. These differences reflect the fact that rate estimates can vary considerably with the timescale of measurement, a frequently observed phenomenon in viruses known as time-dependent evolutionary rates (refs. 41,43,44).
Software package for assigning SARS-CoV-2 genome sequences to global lineages. Since experts have suggested that pangolins may be the reservoir species for COVID-19, the scaly anteater has been catapulted into headlines, news reports, and conversations, and some are calling COVID-19 "the revenge of the . This underscores the need for a global network of real-time human disease surveillance systems, such as that which identified the unusual cluster of pneumonia in Wuhan in December 2019, with the capacity to rapidly deploy genomic tools and functional studies for pathogen identification and characterization. The presence in pangolins of an RBD very similar to that of SARS-CoV-2 means that we can infer this was also probably in the virus that jumped to humans. Early detection via genomics was not possible during Southeast Asia's initial outbreaks of avian influenza H5N1 (1997 and 2003–2004) or the first SARS outbreak (2002–2003). The histogram allows for the identification of non-recombining regions (NRRs) by revealing regions with no breakpoints. We thank T. Bedford for providing M.F.B. The coverage threshold and consensus sequence generation threshold were set to 20 and 90 respectively. 1a-c ), has the third-highest number of confirmed COVID-19 cases in the state of So. Bayesian phylogenetic and phylodynamic data integration using BEAST 1.10. All custom code used in the manuscript is available at https://github.com/plemey/SARSCoV2origins. Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Unfortunately, a response that would achieve containment was not possible. Preprint at https://doi.org/10.1101/2020.05.28.122366 (2020).
BEAGLE 3: improved performance, scaling, and usability for a high-performance computing library for statistical phylogenetics. Region C showed no PI signals within it. For the HCoV-OC43, MERS-CoV and SARS datasets we specified flexible skygrid coalescent tree priors. Root-to-tip divergence as a function of sampling time for non-recombinant regions NRR1 and NRR2 and recombination-masked alignment set NRA3. Evolutionary rate estimation can be profoundly affected by the presence of recombination (ref. 50). Posterior distributions were approximated through Markov chain Monte Carlo sampling, with chains run sufficiently long to ensure effective sample sizes >100. matics program called Pangolin was developed. Temporal signal was tested using a recently developed marginal likelihood estimation procedure (ref. 41; Supplementary Table 1). A hypothesis of snakes as intermediate hosts of SARS-CoV-2 was posited during the early epidemic phase (ref. 54), but we found no evidence of this (refs. 55,56); see Extended Data Fig. The SARS-CoV divergence times are somewhat earlier than dates previously estimated (ref. 15) because previous estimates were obtained using a collection of SARS-CoV genomes from human and civet hosts (as well as a few closely related bat genomes), which implies that evolutionary rates were predominantly informed by the short-term SARS outbreak scale and probably biased upwards. EPI_ISL_410721) and Beijing Institute of Microbiology and Epidemiology (W.-C. Cao, T.T.-Y.L., N. Jia, Y.-W. Zhang, J.-F. Jiang and B.-G. Jiang, nos. Indeed, the rates reported by these studies are in line with the short-term SARS rates that we estimate (Fig.
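The root-to-tip analysis mentioned above (divergence as a function of sampling time) can be illustrated with a plain least-squares fit. This is only a minimal sketch, not the authors' workflow: the function name and the toy input values are hypothetical, and a real analysis would take root-to-tip distances from a rooted tree rather than from hand-typed numbers.

```python
from typing import Sequence, Tuple


def root_to_tip_regression(
    dates: Sequence[float], divergences: Sequence[float]
) -> Tuple[float, float, float]:
    """Fit divergence = slope * date + intercept by ordinary least squares.

    Returns (slope, intercept, r_squared). The slope is a crude rate
    estimate (substitutions per site per year) and r_squared is a rough
    indicator of how clock-like the data are.
    """
    n = len(dates)
    if n != len(divergences) or n < 2:
        raise ValueError("need two equally long sequences with >= 2 points")

    mean_x = sum(dates) / n
    mean_y = sum(divergences) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in divergences)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, divergences))
    if sxx == 0:
        raise ValueError("all sampling dates are identical")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Hypothetical toy data: sampling years and root-to-tip distances.
    years = [2000.0, 2005.0, 2010.0]
    dists = [0.010, 0.015, 0.020]
    slope, intercept, r2 = root_to_tip_regression(years, dists)
    print(f"rate={slope:.6f} subs/site/year, R^2={r2:.3f}")
    if slope > 0:
        # Date at which the fitted line reaches zero divergence.
        print(f"x-intercept (root date estimate)={-intercept / slope:.1f}")
```

A weak or negative slope in such a plot is what is meant by a missing root-to-tip temporal signal; the text addresses that case with formal marginal likelihood testing and informative rate priors.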
Green boxplots show the TMRCA estimate for the RaTG13/SARS-CoV-2 lineage and its most closely related pangolin lineage (Guangdong 2019), with the light and dark coloured versions based on the HCoV-OC43 and MERS-CoV centred priors, respectively. EPI_ISL_410538, EPI_ISL_410539, EPI_ISL_410540, EPI_ISL_410541 and EPI_ISL_410542) for the use of sequence data via the GISAID platform. The 2009 influenza pandemic and subsequent outbreaks of MERS-CoV (2012), H7N9 avian influenza (2013), Ebola virus (2014) and Zika virus (2015) were met with rapid sequencing and genomic characterization. 62,63), the GTR+Γ model and 100 bootstrap replicates, was inferred for each BFR >500 nt. This new approach classifies the newly sequenced genome against all the diverse lineages present instead of a representative selection of sequences. Sorting these breakpoint-free regions (BFRs) by length results in two segments >5 kb: an ORF1a subregion spanning nucleotides (nt) 3,625–9,150 and the first half of ORF1b spanning nt 13,291–19,628 (sequence numbering given in Source Data, https://github.com/plemey/SARSCoV2origins). The shaded region corresponds to the S protein. Based on the identified breakpoints in each genome, only the major non-recombinant region is kept in each genome while other regions are masked. Schierup, M. H. & Hein, J. Recombination and the molecular clock. The key to successful surveillance is knowing which viruses to look for and prioritizing those that can readily infect humans (ref. 47). Two other bat viruses (CoVZXC21 and CoVZC45) from Zhejiang Province fall on this lineage as recombinants of the RaTG13/SARS-CoV-2 lineage and the clade of Hong Kong bat viruses sampled between 2005 and 2007 (Fig. These rate priors are subsequently used in the Bayesian inference of posterior rates for NRR1, NRR2, and NRA3 as indicated by the solid arrows. The construction of NRR1 is the most conservative as it is least likely to contain any remaining recombination signals.
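The step of sorting breakpoint-free regions (BFRs) by length can be sketched as follows. This is an illustrative assumption about the bookkeeping only: the function name, the 1-based inclusive coordinate convention and the example breakpoints are hypothetical and are not taken from the study's source data.

```python
from typing import Iterable, List, Tuple


def breakpoint_free_regions(
    breakpoints: Iterable[int], alignment_length: int, min_length: int = 0
) -> List[Tuple[int, int, int]]:
    """Return (start, end, length) for each region between breakpoints.

    Coordinates are 1-based and inclusive. A breakpoint at position p is
    treated as a break immediately after p. Regions are returned longest
    first, and those shorter than min_length are dropped.
    """
    # Keep only breakpoints that fall strictly inside the alignment.
    inner = sorted({p for p in breakpoints if 0 < p < alignment_length})
    edges = [0] + inner + [alignment_length]
    regions = [(a + 1, b, b - a) for a, b in zip(edges, edges[1:])]
    kept = [r for r in regions if r[2] >= min_length]
    return sorted(kept, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical breakpoints on a 1,000-position alignment.
    for start, end, length in breakpoint_free_regions([100, 700, 250], 1000):
        print(f"{start}-{end}\t{length} positions")
```

Passing a `min_length` (for example 500 or 5000) mirrors the size filters mentioned in the text, such as inferring trees only for BFRs longer than 500 nt.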
1 Phylogenetic relationships in the C-terminal domain (CTD). Because there is no single accepted method of inferring breakpoints and identifying clean subregions with high certainty, we implemented several approaches to identifying three classic statistical signals of recombination: mosaicism, phylogenetic incongruence and excessive homoplasy (ref. 51). Specifically, we used a combination of six methods implemented in v.5.5 of RDP5 (ref. One study suggests that over a century ago, one lineage of coronavirus circulating in bats gave rise to SARS-CoV-2, RaTG13 and a pangolin coronavirus known as Pangolin-2019 (Live Science). Uncertainty measures are shown in Extended Data Fig. As illustrated by the dashed arrows, these two posteriors motivate our specification of prior distributions with standard deviations inflated 10-fold (light color). Coronavirus: Pangolins may have spread the disease to humans. 1) and thus likely to be the product of recombination, acquiring a divergent variable loop from a hitherto unsampled bat sarbecovirus (ref. 28). This long divergence period suggests there are unsampled virus lineages circulating in horseshoe bats that have zoonotic potential due to the ancestral position of the human-adapted contact residues in the SARS-CoV-2 RBD. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent for the current coronavirus disease (COVID-19) pandemic that has affected more than 35 million people and caused . 4), but also by markedly different evolutionary rates. This study provides an integration of existing classifications and describes evolutionary trends of the SARS-CoV .
One geographic clade includes viruses from provinces in southern China (Guangxi, Yunnan, Guizhou and Guangdong), with its major sister clade consisting of viruses from provinces in northern China (Shanxi, Henan, Hebei and Jilin) as well as Hubei Province in central China and Shaanxi Province in northwestern China. Center for Infectious Disease Dynamics, Department of Biology, Pennsylvania State University, University Park, PA, USA; Department of Microbiology, Immunology and Transplantation, KU Leuven, Rega Institute, Leuven, Belgium; Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, China; State Key Laboratory of Emerging Infectious Diseases, School of Public Health, The University of Hong Kong, Hong Kong SAR, China; Department of Biology, University of Texas Arlington, Arlington, TX, USA; Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK; MRC-University of Glasgow Centre for Virus Research, Glasgow, UK. Wang, H., Pipes, L. & Nielsen, R. Synonymous mutations and the molecular evolution of SARS-CoV-2 origins. =0.00025. Five example sequences with incongruent phylogenetic positions in the two trees are indicated by dashed lines. 95% credible interval bars are shown for all internal node ages. This boundary appears to be rarely crossed. 36) (RDP, GENECONV, MaxChi, Bootscan, SisScan and 3SEQ) and considered recombination signals detected by more than two methods for breakpoint identification. TMRCA estimates for SARS-CoV-2 and SARS-CoV from their respective most closely related bat lineages are reasonably consistent for the different data sets and different rate priors in our analyses. We compare both MERS-CoV- and HCoV-OC43-centred prior distributions (Extended Data Fig.
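The rule of keeping only recombination signals detected by more than two of the six methods amounts to a simple vote count. The sketch below is an assumption about how such a filter could be written; the event identifiers and the input dictionary are hypothetical and do not reflect RDP5's actual output format.

```python
from typing import Dict, Iterable, List


def consensus_events(
    detections: Dict[str, Iterable[str]], min_methods: int = 3
) -> List[str]:
    """Return event IDs supported by at least min_methods distinct methods.

    "More than two methods" in the text corresponds to min_methods=3.
    Duplicate method names for the same event are counted once.
    """
    return sorted(
        event
        for event, methods in detections.items()
        if len(set(methods)) >= min_methods
    )


if __name__ == "__main__":
    # Hypothetical detections: event ID -> methods that flagged it.
    calls = {
        "event_A": ["RDP", "GENECONV", "MaxChi", "3SEQ"],
        "event_B": ["Bootscan", "SisScan"],
        "event_C": ["RDP", "RDP", "MaxChi"],
    }
    print(consensus_events(calls))
```

Requiring agreement between several detectors trades sensitivity for specificity: fewer breakpoints are called, but each is less likely to be an artefact of a single method.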
We call this approach breakpoint-conservative, but note that this has the opposite effect to the construction of NRR1 in that this approach is the most likely to allow breakpoints to remain inside putative non-recombining regions. The fact that these estimates lie between the rates for MERS-CoV and HCoV-OC43 is consistent with the intermediate sampling time range of about 18 years (Fig. SARS-like WIV1-CoV poised for human emergence. & Holmes, E. C. A genomic perspective on the origin and emergence of SARS-CoV-2. To evaluate the performance procedure, we confirmed that the recombination masking resulted in (1) a markedly different outcome of the PHI test (ref. 64), (2) removal of well-supported (bootstrap value >95%) incompatible splits in Neighbor-Net (ref. 65) and (3) a near-complete reduction of mosaic signal as identified by 3SEQ. G066215N, G0D5117N and G0B9317N)) and by the European Union's Horizon 2020 project MOOD (no. However, on closer inspection, the relative divergences in the phylogenetic tree (Fig. Avian influenza A virus (H7N7) epidemic in The Netherlands in 2003: course of the epidemic and effectiveness of control measures. Virological.org http://virological.org/t/ncovs-relationship-to-bat-coronaviruses-recombination-signals-no-snakes-no-evidence-the-2019-ncov-lineage-is-recombinant/331 (2020). A SARS-like cluster of circulating bat coronaviruses shows potential for human emergence. Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor. We demonstrate that the sarbecoviruses circulating in horseshoe bats have complex recombination histories as reported by others (refs. 15,20,21,22,23,24,25,26). performed S recombination analysis.
Because the SARS-CoV-2 S protein has been implicated in past recombination events or possibly convergent evolution (ref. 12), we specifically investigated several subregions of the S protein: the N-terminal domain of S1, the C-terminal domain of S1, the variable-loop region of the C-terminal domain, and S2. Accurate estimation of ages for deeper nodes would require adequate accommodation of time-dependent rate variation. The pangolin coronaviruses show lower similarity to SARS-CoV-2 than bat coronavirus RaTG13 across the whole genome, but higher similarity in the spike receptor binding domain, although the similarity at either scale remains too low to implicate . Divergence time estimates based on the HCoV-OC43-centred rate prior for the separate BFRs (Supplementary Table 3) show consistency in TMRCA estimates across the genome. Our approach resulted in similar posterior rates using two different prior means, implying that the sarbecovirus data do inform the rate estimate even though a root-to-tip temporal signal was not apparent. The scientists defined the pangolin lineage of this variant to be B.1.1.523, and it was originally recognized as a variant under monitoring on July 14, 2021. The proximal origin of SARS-CoV-2. In our second stage, we wanted to construct non-recombinant regions where our approach to breakpoint identification was as conservative as possible. Aside from RaTG13, Pangolin-CoV is the most closely related CoV to SARS-CoV-2. The most parsimonious explanation for these shared ACE2-specific residues is that they were present in the common ancestors of SARS-CoV-2, RaTG13 and Pangolin Guangdong 2019, and were lost through recombination in the lineage leading to RaTG13.
COVID-19 lineage names can be confusing to navigate; there are many aliases, and if you want to catch them all to examine further in data analyses it helps to . Because coronaviruses are known to be highly recombinant, we used three different approaches to identify non-recombinant regions for use in our Bayesian time-calibrated phylogenetic inference. 26 March 2020. ISSN 2058-5276 (online). The sizes of the black internal node circles are proportional to the posterior node support. The genetic distances between SARS-CoV-2 and RaTG13 (bottom) demonstrate that their relationship is consistent across all regions except for the variable loop. Divergence time estimates based on the three regions/alignments where the effects of recombination have been removed. Divergence dates between SARS-CoV-2 and the bat sarbecovirus reservoir were estimated as 1948 (95% highest posterior density (HPD): 1879–1999), 1969 (95% HPD: 1930–2000) and 1982 (95% HPD: 1948–2009), indicating that the lineage giving rise to SARS-CoV-2 has been circulating unnoticed in bats for decades. Identifying SARS-CoV-2-related coronaviruses in Malayan pangolins. N. China corresponds to Jilin, Shanxi, Hebei and Henan provinces, and the N. China clade also includes one sequence sampled in Hubei Province in 2004. The time-calibrated phylogeny represents a maximum clade credibility tree inferred for NRR1. The latter was reconstructed using IQTREE (ref. 66) v.2.0 under a general time-reversible (GTR) model with a discrete gamma distribution to model inter-site rate variation.
We use three bioinformatic approaches to remove the effects of recombination, and we combine these approaches to identify putative non-recombinant regions that can be used for reliable phylogenetic reconstruction and dating. With horseshoe bats currently the most plausible origin of SARS-CoV-2, it is important to consider that sarbecoviruses circulate in a variety of horseshoe bat species with widely overlapping species ranges (ref. 57). & Li, X. Cross-species transmission of the newly identified coronavirus 2019-nCoV. In our analyses of the sarbecovirus datasets, we incorporated the uncertainty of the sampling dates when exact dates were not available. Epidemiology, genetic recombination, and pathogenesis of coronaviruses. Researchers have found that SARS-CoV-2 in humans shares about 90.3% of its genome sequence with a coronavirus found in pangolins (Cyranoski, 2020). Of importance for future spillover events is the appreciation that SARS-CoV-2 has emerged from the same horseshoe bat subgenus that harbours SARS-like coronaviruses. 27) receptors and its RBD being genetically closer to a pangolin virus than to RaTG13 (refs. All four of these breakpoints were also identified with the tree-based recombination detection method GARD (ref. 35). The red and blue boxplots represent the divergence time estimates for SARS-CoV-2 (red) and the 2002–2003 SARS-CoV (blue) from their most closely related bat virus, with the light- and dark-colored versions based on the HCoV-OC43 and MERS-CoV centered priors, respectively. Don't blame pangolins, coronavirus family tree tracing could prove key. When viewing the last 7 kb of the genome, a clade of viruses from northern China appears to cluster with sequences from southern Chinese provinces but, when inspecting trees from different parts of ORF1ab, the N.
China clade is phylogenetically separated from the S. China clade. Pangolin relies on a novel algorithm called pangoLEARN. The presence of SARS-CoV-2-related viruses in Malayan pangolins, in silico analysis of the ACE2 receptor polymorphism and sequence similarities between the Receptor Binding Domain (RBD) of the spike proteins of pangolin and human Sarbecoviruses led to the proposal of pangolin as intermediary. Since the release of Version 2.0 in July 2020, however, it has used the 'pangoLEARN' machine-learning-based assignment algorithm to assign lineages to new SARS-CoV-2 genomes. Given that these pangolin viruses are ancestral to the progenitor of the RaTG13/SARS-CoV-2 lineage, it is more likely that they are also acquiring viruses from bats. The coronavirus genome that these researchers had assembled, from pangolin lung-tissue samples, contained some gene regions that were ninety-nine per cent similar to equivalent parts of the SARS . a, Breakpoints identified by 3SEQ illustrated by percentage of sequences (out of 68) that support a particular breakpoint position. Holmes, E. C. The Evolution and Emergence of RNA Viruses (Oxford Univ. Press, 2009). The ongoing pandemic spread of a new human coronavirus, SARS-CoV-2, which is associated with severe pneumonia/disease (COVID-19), has resulted in the generation of tens of thousands of virus . 5 (NRR1) are conservative in the sense that NRR1 is more likely to be non-recombinant than NRR2 or NRA3. Phylogenetic classification of the whole-genome sequences of SARS-CoV-2. The inset represents divergence time estimates based on NRR1, NRR2 and NRA3. Preprint at https://doi.org/10.1101/2020.04.20.052019 (2020). However, inconsistency in the nomenclature limits uniformity in its epidemiological understanding.
acknowledges support by the Research Foundation - Flanders (Fonds voor Wetenschappelijk Onderzoek - Vlaanderen (nos. 31922087). All three approaches to removal of recombinant genomic segments point to a single ancestral lineage for SARS-CoV-2 and RaTG13. Scientists trying to trace the ancestry of SARS-CoV-2, the virus responsible for COVID-19, have found the pangolin is unlikely to be the source of the virus responsible for the current pandemic. Regions B and C span nt 3,625–9,150 and 9,261–11,795, respectively. DOI: https://doi.org/10.1038/s41564-020-0771-4. An initial genomic sequence analysis found that the reemergence of COVID-19 in New Zealand was caused by a SARS-CoV-2 from the (now ancestral) lineage B.1.1.1 of the pangolin nomenclature (17). Except for specifying that sequences are linear, all settings were kept to their defaults. Although the human ACE2-compatible RBD was very likely to have been present in a bat sarbecovirus lineage that ultimately led to SARS-CoV-2, this RBD sequence has hitherto been found in only a few pangolin viruses. 11,12,13,22,28), a signal that suggests recombination, the divergence patterns in the S protein do not show evidence of recombination between the lineage leading to SARS-CoV-2 and known sarbecoviruses. Aiewsakun, P. & Katzourakis, A. Time-dependent rate phenomenon in viruses. Katoh, K., Asimenos, G. & Toh, H. in Bioinformatics for DNA Sequence Analysis (ed. In the absence of a strong temporal signal, we sought to identify a suitable prior rate distribution to calibrate the time-measured trees by examining several coronaviruses sampled over time, including HCoV-OC43, MERS-CoV, and SARS-CoV virus genomes.
By mid-January 2020, the virus was spreading widely within Hubei province and by early March SARS-CoV-2 was declared a pandemic (ref. 8). We compiled a dataset including 27 human coronavirus OC43 virus genomes and ten related animal virus genomes (six bovine, three white-tailed deer and one canine virus). Forni, D., Cagliani, R., Clerici, M. & Sironi, M. Molecular evolution of human coronavirus genomes. The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2. Developed by the Centre for Genomic Pathogen Surveillance. While there is involvement of other mammalian species (specifically pangolins for SARS-CoV-2) as a plausible conduit for transmission to humans, there is no evidence that pangolins are facilitating adaptation to humans. https://doi.org/10.1093/molbev/msaa163 (2020). performed recombination and phylogenetic analysis and annotated virus names with geographical and sampling dates. Rambaut, A., Lam, T. T., Carvalho, L. M. & Pybus, O. G. Exploring the temporal structure of heterochronous sequences using TempEst (formerly Path-O-Gen). July 26, 2021. # File containing the ID of the samples, the Sequence of the haplotype, the Continent, the country, the Region, the Data, the Lineage of Pangolin and Nextstrain clade, and the haplotype number # In this order # Could be obtained from the database & Holmes, E. C. Recombination in evolutionary genomics. A new coronavirus associated with human respiratory disease in China. Pangolins: What are they and why are they linked to Covid-19?
Influenza viruses reassort (ref. 17) but they do not undergo homologous recombination within RNA segments (refs. 18,19), meaning that origins questions for influenza outbreaks can always be reduced to origins questions for each of influenza's eight RNA segments. & Andersen, K. G. The evolution of Ebola virus: insights from the 2013–2016 epidemic.