Enhancing SARS-CoV-2 Lineage Surveillance Through the Integration of a Simple and Direct qPCR-Based Protocol Adaptation with Established Machine Learning Algorithms
Emerging and evolving Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) lineages, adapted to changing epidemiological conditions, present unprecedented challenges to global public health systems. Here, we introduce an adapted analytical approach that complements genomic sequencing with a cost-effective quantitative polymerase chain reaction (qPCR)-based assay. Viral RNA samples from SARS-CoV-2-positive cases detected by diagnostic laboratories or public health network units in Ceará, Brazil, were tracked for genomic surveillance and analyzed using paired-end sequencing combined with integrative genomic analysis. A key structural variation, a specific () gene deletion within the tracked "BE.9" lineages, was validated by gel electrophoresis. The analytical innovation of our method is the optimization of a simple intercalating dye-based qPCR assay that repositions primers from the ARTIC v4.1 amplicon panel to detect large molecular patterns. This assay distinguishes "BE.9" from "non-BE.9" lineages, particularly BQ.1, without the need for expensive probes or sequencing. The protocol was validated against lineage predictions from next-generation sequencing (NGS) using 525 paired samples, achieving 93.3% sensitivity, 95.1% specificity, and 92.4% agreement, as measured by Cohen's Kappa coefficient. Machine learning (ML) models were trained on the melting curves from intercalating dye-based qPCR of 1724 samples, enabling highly accurate lineage assignment. Among them, the support vector machine (SVM) model performed best and, after fine-tuning, reached ∼96.52% accuracy (333/345) on the test data set. Our integrated approach provides an adapted analytical method that is both cost-effective and scalable, suitable for rapid assessment of emerging variants, especially in resource-limited settings.
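For readers unfamiliar with the validation metrics named above, the following minimal sketch shows how sensitivity, specificity, and Cohen's kappa can be derived from a 2x2 table of qPCR calls against NGS lineage assignments. The function name `diagnostic_agreement` and the counts passed to it are hypothetical and chosen only for easy arithmetic; they are not the study's data.

```python
def diagnostic_agreement(tp, fp, fn, tn):
    """Sensitivity, specificity and Cohen's kappa for a 2x2 table.

    tp: qPCR "BE.9"     and NGS "BE.9"
    fp: qPCR "BE.9"     and NGS "non-BE.9"
    fn: qPCR "non-BE.9" and NGS "BE.9"
    tn: qPCR "non-BE.9" and NGS "non-BE.9"
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # Observed agreement: fraction of samples where both methods agree.
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals of each method.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


# Hypothetical counts for illustration only (not taken from the study).
sens, spec, kappa = diagnostic_agreement(tp=90, fp=5, fn=10, tn=95)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} kappa={kappa:.3f}")
```

Kappa corrects the raw agreement for the agreement expected by chance, which is why it is usually reported alongside sensitivity and specificity rather than in place of them.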
In this work, the protocol is applied to improve the monitoring of SARS-CoV-2 sublineages but can be extended to track any key molecular signature, including large insertions and deletions (indels) commonly observed in pathogenic agent subtypes. By offering a complement to traditional sequencing methods and utilizing easily trainable machine learning algorithms, our methodology contributes to enhanced molecular surveillance strategies and supports global efforts in pandemic control.
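To illustrate how a melting curve can carry a lineage signal, the sketch below extracts the melting peak (the temperature at which -dF/dT is largest) from a curve and assigns the nearest reference label. It is not the SVM pipeline described above: it swaps in a one-feature nearest-centroid rule for brevity, the curves are synthetic sigmoids, and the reference temperatures in `centroids` are hypothetical placeholders rather than measured values.

```python
import math


def melt_curve(tm, temps, width=0.8):
    """Synthetic fluorescence: a sigmoid drop centred on tm (illustrative only)."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]


def melting_peak(temps, fluorescence):
    """Return the temperature where the negative derivative -dF/dT is largest."""
    best_t, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        # Central difference approximation of -dF/dT at temps[i].
        slope = -(fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_t, best_slope = temps[i], slope
    return best_t


def classify(tm, centroids):
    """Assign the label whose reference melting temperature is closest to tm."""
    return min(centroids, key=lambda label: abs(tm - centroids[label]))


temps = [70.0 + 0.5 * i for i in range(41)]   # 70.0 to 90.0 degrees C in 0.5 steps
centroids = {"BE.9": 78.0, "non-BE.9": 82.0}  # hypothetical reference peaks

sample = melt_curve(78.5, temps)              # synthetic sample curve
peak = melting_peak(temps, sample)
label = classify(peak, centroids)
print(peak, label)
```

A trained classifier such as an SVM would typically use the whole curve, or several derived features, rather than a single peak temperature; the single-feature rule here only shows why amplicons that differ by a large indel can be separable without probes.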