
Intrinsic Laws of K-mer Spectra of Genome Sequences and Evolution Mechanism of Genomes

Overview
Journal: BMC Evol Biol
Publisher: BioMed Central
Specialty: Biology
Date: 2020 Nov 24
PMID: 33228538
Citations: 5
Abstract

Background: K-mer spectra of DNA sequences contain important information about sequence composition and sequence evolution. We aimed to reveal the evolutionary rules of genome sequences by studying their k-mer spectra.
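
To make the term concrete, the sketch below counts overlapping k-mers in a DNA string and derives a spectrum from the counts. It assumes one common definition of a k-mer spectrum (the number of distinct k-mers observed at each occurrence count, including k-mers that never occur); the abstract does not state the exact definition used in the paper, and the function names `kmer_counts` and `kmer_spectrum` are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer that consists only of A, C, G, T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip windows containing ambiguous bases such as N.
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_spectrum(counts, k):
    """Map each occurrence count to the number of distinct k-mers with that count.

    Assumed definition: k-mers that never occur are reported under count 0.
    """
    spectrum = Counter(counts.values())
    spectrum[0] = 4 ** k - len(counts)
    return dict(sorted(spectrum.items()))


if __name__ == "__main__":
    counts = kmer_counts("ACGTACGT", 2)
    print(kmer_spectrum(counts, 2))  # {0: 12, 1: 1, 2: 3}
```

For a real genome the sequence would be read from a FASTA file and k would typically be larger; the toy string here only illustrates the counting.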

Results: We analyzed the intrinsic laws of the k-mer spectra of 920 genome sequences, ranging from primates to prokaryotes. We found two types of evolutionary selection modes in genome sequences, which we named CG Independent Selection and TA Independent Selection, and the two inhibit each other. The intensities of the CG and TA independent selections correlate closely with genome evolution and with the G + C content of genome sequences. The living habits of a species are closely related to the independent selection modes adopted by its genome. We therefore proposed an evolution mechanism in which genome evolution is determined by the intensities of the CG and TA independent selections and by the mutual inhibition between them. Based on this mechanism, we speculated on the evolution modes of prokaryotes in mild and extreme environments during the anaerobic age, on the transition of prokaryotes from anaerobic to aerobic environments on Earth, and on the origins of different eukaryotes.
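
The two quantities named above, G + C content and the CG and TA dinucleotides, can be illustrated with a small sketch. The abstract does not describe how the selection intensities are measured, so the grouping below (k-mers split by how many times a given dinucleotide occurs in them) is only an assumed illustration and not the authors' method; `gc_content` and `split_by_dinucleotide` are hypothetical names.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / total


def split_by_dinucleotide(counts, dinuc):
    """Group k-mer counts by how many times `dinuc` occurs in each k-mer.

    Assumption for illustration only: overlapping occurrences are counted.
    """
    groups = {}
    for kmer, n in counts.items():
        occurrences = sum(
            1 for i in range(len(kmer) - 1) if kmer[i:i + 2] == dinuc
        )
        groups.setdefault(occurrences, {})[kmer] = n
    return groups


if __name__ == "__main__":
    toy_counts = {"ACGT": 3, "CGCG": 1, "AAAA": 2, "TATA": 4}
    print(gc_content("ACGT"))
    print(split_by_dinucleotide(toy_counts, "CG"))
    print(split_by_dinucleotide(toy_counts, "TA"))
```

Each group could then be given its own spectrum and compared across genomes, which is one plausible way to look for the CG- and TA-related differences the abstract reports.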

Conclusion: We found two independent selection modes in genome sequences. The evolution of a genome sequence is determined by these two modes and by the mutual inhibition between them.

Citing Articles

Difference Analysis Among Six Kinds of Acceptor Splicing Sequences by the Dispersion Features of 6-mer Subsets in Human Genes.

Si Y, Li H, Li X. Biology (Basel). 2025; 14(2).

PMID: 40001974 PMC: 11853274. DOI: 10.3390/biology14020206.


Distribution rules of 8-mer spectra and characterization of evolution state in animal genome sequences.

Li X, Li H, Yang Z, Wang L. BMC Genomics. 2024; 25(1):855.

PMID: 39266973 PMC: 11391722. DOI: 10.1186/s12864-024-10786-1.


A survey of k-mer methods and applications in bioinformatics.

Moeckel C, Mareboina M, Konnaris M, Chan C, Mouratidis I, Montgomery A. Comput Struct Biotechnol J. 2024; 23:2289-2303.

PMID: 38840832 PMC: 11152613. DOI: 10.1016/j.csbj.2024.05.025.


The determinants of the rarity of nucleic and peptide short sequences in nature.

Chantzi N, Mareboina M, Konnaris M, Montgomery A, Patsakis M, Mouratidis I. NAR Genom Bioinform. 2024; 6(2):lqae029.

PMID: 38584871 PMC: 10993293. DOI: 10.1093/nargab/lqae029.


Frequentmers - a novel way to look at metagenomic next generation sequencing data and an application in detecting liver cirrhosis.

Mouratidis I, Chantzi N, Khan U, Konnaris M, Chan C, Mareboina M. BMC Genomics. 2023; 24(1):768.

PMID: 38087204 PMC: 10714505. DOI: 10.1186/s12864-023-09861-w.

