GRAMEP: an Alignment-free Method Based on the Maximum Entropy Principle for Identifying SNPs

Overview
Publisher Biomed Central
Specialty Biology
Date 2025 Feb 25
PMID 40000933
Abstract

Background: Advances in high-throughput sequencing technologies have made a huge number of genomes available for analysis, so computational methods play a crucial role in extracting knowledge from the data generated. Investigating genomic mutations is critical because of their impact on chromosomal evolution, genetic disorders, and diseases. Genomic variation is commonly analyzed by aligning sequences. However, alignment can be computationally expensive and restrictive for large datasets.

Results: We present GRAMEP, a novel alignment-free method for identifying single nucleotide polymorphisms (SNPs) in DNA sequences from assembled genomes. GRAMEP adopts the principle of maximum entropy to discover the most informative k-mers specific to a genome or set of sequences under investigation. These informative k-mers enable the detection of variant-specific mutations relative to a reference genome or another set of sequences. In addition, the method can classify novel sequences without organism-specific information. GRAMEP demonstrated high accuracy in both in silico simulations and analyses of viral genomes, including Dengue, HIV, and SARS-CoV-2. It maintained accurate SARS-CoV-2 variant identification at a lower computational cost than methods with the same purpose.
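The idea of selecting informative k-mers and comparing them against a reference can be illustrated with a small sketch. The abstract does not say how the maximum entropy principle is applied, so the code below assumes a Kapur-style maximum-entropy threshold on the k-mer frequency histogram. All function names are hypothetical and this is not the GRAMEP API.

```python
from collections import Counter
from math import log


def count_kmers(sequences, k):
    """Count every overlapping k-mer across a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def max_entropy_threshold(counts):
    """Assumed Kapur-style rule: pick the frequency cut that maximizes the
    summed Shannon entropy of the low and high parts of the histogram."""
    hist = Counter(counts.values())  # frequency -> number of k-mers
    freqs = sorted(hist)

    def entropy(values):
        mass = sum(hist[f] for f in values)
        return -sum((hist[f] / mass) * log(hist[f] / mass) for f in values)

    best_t, best_h = 0, float("-inf")  # threshold 0 keeps every k-mer
    for i in range(len(freqs) - 1):    # cut between freqs[i] and freqs[i + 1]
        h = entropy(freqs[:i + 1]) + entropy(freqs[i + 1:])
        if h > best_h:
            best_t, best_h = freqs[i], h
    return best_t


def informative_kmers(sequences, k):
    """Keep k-mers whose count is strictly above the entropy threshold."""
    counts = count_kmers(sequences, k)
    threshold = max_entropy_threshold(counts)
    return {kmer for kmer, c in counts.items() if c > threshold}


def variant_specific_kmers(variant_seqs, reference_seqs, k):
    """Informative variant k-mers that never occur in the reference."""
    reference = set(count_kmers(reference_seqs, k))
    return informative_kmers(variant_seqs, k) - reference


# Toy data: the variant differs from the reference by one substitution
# (A -> T at 0-based position 4); one variant read has a spurious last base.
reference = ["ACGTAGCA"]
variants = ["ACGTTGCA"] * 5 + ["ACGTTGCC"]
specific = variant_specific_kmers(variants, reference, k=3)
print(sorted(specific))
```

In this toy case the k-mers that survive are the ones overlapping the substituted base, while the k-mer produced by the spurious final base falls below the threshold. No alignment is computed at any point, which is the property the abstract emphasizes.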

Conclusions: GRAMEP is open, user-friendly software based on maximum entropy that provides an efficient alignment-free approach to identifying and classifying unique genomic subsequences and SNPs with high accuracy, offering advantages over comparative methods. The instructions for use, applicability, and usability of GRAMEP are openly available at https://github.com/omatheuspimenta/GRAMEP .
