Coronavirus disease 2019 (COVID-19) has caused millions of deaths worldwide. Among the proteins of SARS-CoV-2, non-structural protein 12 (NSP12) plays a key role during infection as part of the RNA-dependent RNA polymerase complex. Monitoring NSP12 polymorphisms is important both for the design of new antiviral drugs and for tracking viral evolution. This study analyzed the NSP12 mutations detected in SARS-CoV-2 circulating between 2020 and 2022 in the population of the city of Manaus, Amazonas, Brazil. The most frequent mutations found were P323L and G671S. Reports in the literature indicate that these mutations are related to transmission efficiency, which may have contributed to the extremely high number of cases in this location. In addition, two mutations described here (E796D and R914K) lie close to, and have RMSD values similar to, the mutations M794V and N911K, which have been described in the literature as influencing the performance of the NSP12 enzyme. These data demonstrate the need to monitor the emergence of new mutations in NSP12 in order to better understand their consequences for the treatments currently used and for the design of new drugs.
The emergence of the new coronavirus has brought changes of a magnitude never seen by humans. By September 2023, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) had caused 6,959,316 deaths worldwide, leading to the collapse of the healthcare system in a number of countries due to the need for hospitalization of seriously ill patients (WHO 2023). In Brazil, 37,796,956 cases and 705,775 deaths were registered by September 2023, with Manaus, Amazonas, being one of the most affected cities (BRASIL - Ministério da Saúde 2023).
During coronavirus replication, RdRp (RNA-dependent RNA polymerase) plays a crucial role in viral genome transcription and replication. This complex is composed of several proteins, including nonstructural protein 12 (NSP12), the catalytic subunit with RdRp activity, along with NSP7 and NSP8, which stimulate the polymerization activity of NSP12. The resulting complex, known as NSP12-NSP7-NSP8, of approximately 160 kDa, is considered the minimal core necessary for the amplification of viral RNA (Kirchdoerfer & Ward 2019).
Several substances that inhibit RdRp have been tested for the treatment of SARS-CoV-2 infection. Among these is favipiravir, a pyrazinecarboxamide derivative that is used against several RNA viruses, such as influenza, yellow fever and other flaviviruses, among others (Furuta et al. 2009, Shiraki & Daikoku 2020). With the emergence of new variants of SARS-CoV-2, new mutations are being found in different parts of the viral genome. As the pandemic progressed, the scientific community noticed a significant increase in the frequency of mutations in the genome, reaching around 3,000 mutations in isolates worldwide, with the spike and RdRp genes having the highest mutation rates (Flores-Alanis et al. 2021, Eskier et al. 2021, Mercatelli & Giorgi 2020, Koçhan et al. 2021). Therefore, it is extremely important to continue to monitor these variants and to understand the consequences of the mutations for the structure of these proteins, for example, for the design of new antiviral drugs.
In this context, this article aims to analyze the structural changes caused by mutations in NSP12 found in Manaus, Amazonas, Brazil, which were collected during the pandemic caused by the new coronavirus. These mutations have not been previously reported. The modeling and molecular dynamics analyses showed that, although no significant alterations were observed in the structure of NSP12 of these mutants, the positions of the mutations indicate alterations in the mode of action of NSP12 and, consequently, viral replication.
Between 2020 and 2022, 2,356 samples were collected from patients located in Manaus, Amazonas state, Brazil. This study was approved by the Ethics Committee of the Amazonas State University, under approval number: CAAE: 25430719.6.0000.5016.
The NSP12 gene was sequenced following the protocol described by Nascimento et al. (2020), with modifications. The gene was amplified using Platinum SuperFi II Green PCR Master Mix; amplification was confirmed by agarose gel electrophoresis, and the amplicons were then precipitated with PEG 8000 and quantified by fluorimetry. NexteraXT DNA was used to construct NGS libraries, and sequencing was done using the MiSeq v2 reagent kit (500 cycles) on a MiSeq instrument (Illumina) installed at Fiocruz Amazônia. Raw data were converted using the Illumina pipeline in BaseSpace; reads were then quality-trimmed with BBDuk and assembled with BBMap in Geneious v10.2.6, using the sequence NC_045512 as the reference. The alignment was performed using MAFFT 7 in Geneious v10.2.6.
A phylogenetic tree was constructed to infer clades using the diversity in NSP12 amino acid sequences. The maximum likelihood method and a JTT matrix-based model (Jones et al. 1992) were used in the MEGA 11 package (Tamura et al. 2021). This analysis involved 236 sequences (including NC_045512.2 from GenBank; Wu et al. 2020) of 923 amino acids and 177 variable positions.
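The notion of a "variable position" used above can be illustrated with a minimal Python sketch. It assumes an amino acid alignment is already available as a mapping from sequence name to aligned sequence; the function name `variable_positions`, the toy sequences, and the choice to ignore gaps and ambiguous residues are assumptions made for illustration and are not part of the MEGA 11 analysis.

```python
from typing import Dict, List


def variable_positions(alignment: Dict[str, str]) -> List[int]:
    """Return 1-based alignment columns where more than one residue is observed."""
    seqs = list(alignment.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")

    variable = []
    for col in range(length):
        # Gaps ('-') and ambiguous residues ('X') are ignored (assumption of this sketch).
        residues = {s[col] for s in seqs} - {"-", "X"}
        if len(residues) > 1:
            variable.append(col + 1)
    return variable


# Toy alignment with hypothetical names; these are not study data.
toy_alignment = {
    "reference": "MPGEL",
    "sample_1": "MLGEL",
    "sample_2": "MLSEL",
    "sample_3": "MLG-L",
}
print(variable_positions(toy_alignment))  # [2, 3]
```

In this toy case, column 4 is not counted because the only difference is a gap; a real analysis might treat gaps differently.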
Template search, modeling and structural analysis of the SARS-CoV-2 NSP12-NSP7-NSP8-RNA complex were carried out using BLAST (Camacho et al. 2009), Modeller 10.3 (Eswar et al. 2008), Swiss-PdbViewer v4.1 (Kaplan & Littlejohn 2001) and UCSF Chimera (Pettersen et al. 2009). The NSP12 sequences of SARS-CoV-2 were obtained by our group, and the Wuhan isolate was obtained from GenBank (NC_045512.2) (Wu et al. 2020). In summary, the structure of the SARS-CoV-2 NSP12-NSP7-NSP8-RNA complex (PDB ID: 7AAP) (Naydenova et al. 2021) was used as the template. Before modeling the mutations, it was necessary to model the 896-910 and C-terminal regions of the Wuhan NSP12 structure (chain A of 7AAP). Next, the main structures of interest, carrying the mutations found in Manaus and those reported by Miropolskaya et al. (2023), were modeled. The standard pipeline of generating 1,000 models was applied in Modeller 10.3. After modeling, a common preparation and minimization protocol was applied for the molecular dynamics simulations in GROMACS 2023.2 (Van Der Spoel et al. 2005). All mutant and wild-type structures (PDB ID: 7AAP) were submitted to this protocol. Each structure was placed in a solvated cubic box whose faces were 1 nm from the edge of the structural complex. Then, the structures were joined and neutralized with the addition of Na+ ions. Two minimization steps were applied (steepest descent and conjugate gradient) until the maximum force on the complex was below 1,000 kJ·mol⁻¹·nm⁻¹. The temperature of the system was then increased from 50 K to 300 K in 5 steps (50 K to 100 K, 100 K to 150 K, 150 K to 200 K, 200 K to 250 K, 250 K to 300 K), and the velocities at each step were reassigned according to the Maxwell-Boltzmann distribution at that temperature. The systems were then subjected to a short molecular dynamics run with position restraints for 30 ps.
Finally, the system was subjected to 20 ps of equilibration in the NVT ensemble followed by 20 ps in the NPT ensemble. The quality of all the final models was evaluated using the Procheck software (Yuan et al. 2020). All the structures showed good structural quality, with more than 90% of the amino acids in favorable regions and none in disallowed regions. All the final structures were superimposed on the Wuhan structure and the RMSDs were calculated. The RMSDs of the structures carrying the mutations of Miropolskaya et al. (2023) and of those carrying the mutations from Manaus were compared. Structural positions were observed and analyzed using the UCSF Chimera software. In order to compare the structural differences, the structure of SARS-CoV NSP12 (GI: AFM43866.1) was modeled by homology and superimposed on SARS-CoV-2 NSP12 (chain A of PDB ID: 7AAP).
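The stepwise heating described in this protocol can be sketched in Python as follows. This is only an illustration of the temperature stages and of the Maxwell-Boltzmann velocity spread at each stage; it is not a GROMACS input file, and the function names `heating_schedule` and `velocity_sigma`, as well as the use of a carbon atom as the example mass, are assumptions introduced here.

```python
import math
from typing import List, Tuple

K_B = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053906660e-27   # atomic mass unit, kg


def heating_schedule(t_start: float, t_end: float, n_stages: int) -> List[Tuple[float, float]]:
    """Split a temperature ramp into equal stages, returned as (start, end) in kelvin."""
    if n_stages < 1:
        raise ValueError("n_stages must be at least 1")
    width = (t_end - t_start) / n_stages
    return [(t_start + i * width, t_start + (i + 1) * width) for i in range(n_stages)]


def velocity_sigma(temperature_k: float, mass_amu: float) -> float:
    """Standard deviation (m/s) of one velocity component under Maxwell-Boltzmann."""
    return math.sqrt(K_B * temperature_k / (mass_amu * AMU))


# Five stages from 50 K to 300 K, as stated in the protocol.
schedule = heating_schedule(50.0, 300.0, 5)
for start, end in schedule:
    # Carbon (12.011 u) is used purely as an illustrative atom type.
    sigma = velocity_sigma(end, 12.011)
    print(f"{start:.0f} K -> {end:.0f} K: sigma(C) at {end:.0f} K = {sigma:.1f} m/s")
```

Because each velocity component is Gaussian with variance k_B·T/m, the spread grows with the square root of the temperature, which is why velocities are redrawn at every stage rather than simply rescaled once.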
The NSP12 gene sequences were extracted from the 235 SARS-CoV-2 genomes obtained from samples of COVID-19 patients in Manaus between 2020 and 2022. The translated sequences contained 923 amino acids, and the positions of the sites were determined based on the Wuhan reference sequence NC_045512.2 (Wu et al. 2020). Of the 177 variable sites, 38 presented non-unique mutations, and in five of them the frequency was higher than 5 (number of sequences in parentheses): P94L (6) and P94S (3); P323L (233) and P323F (1); A529T (2), A529V (2) and A529S (1); G671S (89) and G671C (1); L838I (6) and L838M (1). These mutations can have significant functional implications, and their analysis provides valuable information for treatment and vaccine development studies, as described by Sahin et al. (2021), who found four new mutations in NSP12 (V111A, H133R, Y453C, M626K).
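A tally of this kind can be sketched in a few lines of Python. The sketch assumes that each translated sample is already aligned to the reference with the same length; the function names `tally_substitutions` and `group_by_position`, the toy sequences, and the rule of skipping gaps and ambiguous residues are assumptions made for illustration, not the procedure actually used in Geneious.

```python
from collections import Counter
from typing import Dict


def tally_substitutions(reference: str, samples: Dict[str, str]) -> Counter:
    """Count substitutions such as 'P2L' across samples aligned to the reference."""
    counts: Counter = Counter()
    for name, seq in samples.items():
        if len(seq) != len(reference):
            raise ValueError(f"{name} is not aligned to the reference")
        for pos, (ref_aa, aa) in enumerate(zip(reference, seq), start=1):
            # Skip identical residues, gaps and ambiguous calls (assumption).
            if aa != ref_aa and aa not in "-X":
                counts[f"{ref_aa}{pos}{aa}"] += 1
    return counts


def group_by_position(counts: Counter) -> Dict[int, Dict[str, int]]:
    """Group substitution counts by position, e.g. {2: {'P2L': 3, 'P2F': 1}}."""
    grouped: Dict[int, Dict[str, int]] = {}
    for label, n in counts.items():
        position = int(label[1:-1])  # strip reference and mutant residue letters
        grouped.setdefault(position, {})[label] = n
    return grouped


# Toy data with hypothetical names; these are not study sequences.
toy_reference = "APGL"
toy_samples = {"s1": "ALGL", "s2": "ALSL", "s3": "AFGL", "s4": "ALG-"}

toy_counts = tally_substitutions(toy_reference, toy_samples)
print(group_by_position(toy_counts))
```

Grouping by position makes it straightforward to separate sites with a single observed substitution from sites where several sequences, or several different residues, are affected.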
The main mutations in the dataset (considering Wuhan as the reference) are shown in Figure 1. At position 323, the Wuhan reference proline was absent in the studied population, with 233 samples presenting leucine and a single sample presenting leucine/phenylalanine (red bar in Figure 1). The second most frequent occurrence was at position 671, in which glycine was replaced by serine in 89 cases and by cysteine in only one case (dark orange bar in Figure 1; names in blue in the phylogenetic tree of Supplementary Material, Figure S1). This group of samples containing G671S does not form a monophyletic cluster (names in blue; Figure S1). Another three sites were polymorphic, but in no more than 3% of the strains. The effect of the distribution of these main mutations on the phylogenetic relationships between the studied lineages is marked by the tendency of the lineages that share these apomorphies to group into small monophyletic clusters (names in green, orange and red; Figure S1). Working with a ferret model, Kim et al. (2023) observed that the NSP12 P323L or P323L/G671S mutation of SARS-CoV-2 is associated with increased stability of the RdRp complex and enzymatic activity, thus promoting efficient transmissibility. The high incidence of isolates containing these mutations in Manaus, Amazonas, may have contributed to the high number of cases observed in this location. In addition, Manaus did not have a satisfactory lockdown system, which led to one of the largest public health disasters ever seen in the Amazon region. The high circulation of the virus in the region greatly contributed to the appearance of the observed mutations (Taylor 2021, Sabino et al. 2021, Naveca et al. 2021).
Figure 1. NSP12 positions with more than one mutation found in sequences from Manaus, Amazonas (Brazil). All the columns show positions with more than one mutation compared to the Wuhan strain. Positions with only one mutation were omitted.
After the modeling and minimization process, the structures of the SARS-CoV-2 NSP12-NSP7-NSP8-RNA complexes selected for analysis were superimposed and their deviations compared using the root mean square deviation (RMSD) (Table I). The analyses revealed that the mutations described by Miropolskaya et al. (2023) present an RMSD that is similar to those described in our study. The most frequent alterations described in the literature (G671S and P323L) showed a lower RMSD than those observed for the mutations described in our study and in Miropolskaya et al. (2023). Mutations in the sequence of NSP12 can alter its activity, which can lead to an improvement in its synthesis capacity and, consequently, a better adaptation or a higher viral replication rate. Through in vitro experiments, Miropolskaya et al. (2023) demonstrated that the M794V and N911K mutations lead to an increase in viral RNA replication. In Figure 2, we observe that the E796D mutation found in Manaus is close to the M794V mutation, while, in Figure 3, the R914K mutation is close to N911K. Due to their proximity to the catalytic site of RNA polymerization (the aspartate triad at residues 618, 760 and 761), these mutations could have a direct influence on enzyme activity. Despite the possible relationship between the E796D and R914K mutations and enzymatic activity, both were singletons in our sample and did not form any cluster or clade in the topology of the phylogenetic analysis (red branches in Figure S1).
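For reference, the RMSD between two structures is the square root of the mean squared distance between paired atoms. The minimal Python sketch below assumes the two coordinate sets are already superimposed and paired atom by atom (for example, Cα atoms); it does not perform the fitting step itself and is not the UCSF Chimera implementation. The function name `rmsd`, the optional `indices` argument for restricting the calculation to a subset of residues, and the toy coordinates are assumptions introduced for illustration.

```python
import math
from typing import Iterable, Optional, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(ref: Sequence[Coord], model: Sequence[Coord],
         indices: Optional[Iterable[int]] = None) -> float:
    """RMSD over paired, pre-superimposed atoms; optionally over a subset of indices."""
    if len(ref) != len(model):
        raise ValueError("both structures must contain the same number of atoms")
    selected = list(range(len(ref))) if indices is None else list(indices)
    if not selected:
        raise ValueError("no atoms selected")

    total = 0.0
    for i in selected:
        # Squared Euclidean distance between the paired atoms.
        total += sum((r - m) ** 2 for r, m in zip(ref[i], model[i]))
    return math.sqrt(total / len(selected))


# Toy coordinates in angstroms; these are not taken from the modeled complexes.
wild_type = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
mutant = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 1.0, 0.0), (3.0, 2.0, 0.0)]

global_rmsd = rmsd(wild_type, mutant)
local_rmsd = rmsd(wild_type, mutant, indices=[2, 3])
print(f"global RMSD = {global_rmsd:.3f} A, local RMSD = {local_rmsd:.3f} A")
```

Restricting the calculation to a subset shows how a structure can have a low global RMSD while a local region deviates much more, since the well-conserved atoms dilute the average.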
Figure 2. Superposition of the modeled structural complexes of NSP12-NSP7-NSP8-RNA from SARS-CoV-2 with the M794V and E796D mutations in NSP12. Ribbon representation of the complex: NSP12 (brown), NSP7 (blue), NSP8 (green) and RNA (red). The amino acid side chains of the Wuhan strain are shown in purple, the M794V mutation of Miropolskaya et al. (2023) in yellow, and E796D of this study in light green.
Figure 3. Superposition of the modeled structural complexes of NSP12-NSP7-NSP8-RNA from SARS-CoV-2 with the N911K and R914K mutations in NSP12. Ribbon representation of the complex: NSP12 (brown), NSP7 (blue), NSP8 (green) and RNA (red). The amino acid side chains of the Wuhan strain are shown in purple, the N911K mutation of Miropolskaya et al. (2023) in yellow, and R914K of this study in light green.
Table I. RMSD between modeled SARS-CoV-2 NSP12 structures from NSP12-NSP7-NSP8-RNA complexes with mutations in NSP12.
The RdRp P323L mutation was described by Chand et al. (2020), Kannan et al. (2020) and Eskier et al. (2020). The latter reported that the P323L mutation (14408C>T) leads to an elevation in the mutation rate, while 15324C>T causes the opposite. Other mutations have also been found in Europe and North America, such as Val473, Arg555 (Mari et al. 2021) and NSP12 L314P (Ward et al. 2021), suggesting that the virus is evolving (Pachetti et al. 2020). Goldswain et al. (2023) noted that P323L is an important contributor to the emergence of variants with transmission advantages. Kim et al. (2023) indicated that the NSP12 P323L or P323L/G671S mutation of SARS-CoV-2 is associated with an increase in RdRp complex stability and enzyme activity, thus promoting efficient transmissibility. The P323L and G671S mutations are close to NSP8, while the R914K and E796D mutations found in the analyzed isolates from Manaus, as well as M794V and N911K (Miropolskaya et al. 2023), are not close to either the NSP7 or the NSP8 protein. This indicates that they are probably not involved in facilitating or preventing the assembly of this complex. However, it is still possible that some of the observed mutations influence this process, either positively or negatively, by disrupting non-covalent interactions between structures that may be essential for the interaction of the complex, or by altering the folding energy (Gao et al. 2015). SARS-CoV NSP12 has 95% similarity to SARS-CoV-2 NSP12. In turn, structural superposition of the globally aligned residues of the SARS-CoV NSP12 structure on the SARS-CoV-2 NSP12 structure resulted in a low RMSD of 0.583 Å, showing structural conservation of the backbone. However, when considering just the residues of the two loops (Figure S2), the RMSD is approximately 16 Å. These loops are close to the entry and exit channels of the RNA, which suggests that the catalytic mechanism may be affected by this difference. This feature was also observed by Gao et al. (2020).
In conclusion, new local mutations were recorded in Manaus. The main mutations, P323L and G671S, have been reported in other studies as increasing transmission efficiency, which may have contributed to the extremely high number of cases in this region. Two other mutations (R914K and E796D) lie close to alterations that have already been cited as interfering with replication processes. Their high RMSD values from the modeling suggest that monitoring of SARS-CoV-2 NSP12 variants could include analyzing the influence of non-synonymous amino acid mutations on enzyme structure and activity. This highlights the importance of monitoring SARS-CoV-2 variants, as they may affect the effectiveness of treatment against this disease.
The authors would like to thank the Program for Technological Development in Tools for Health – FIOCRUZ for the use of the nucleotide sequencing facilities at ILMD-Fiocruz Amazônia.
