
Structural Underpinnings of Mutation Rate Variations in the Human Genome

Overview
Specialty: Biochemistry
Date: 2023 Jul 3
PMID: 37395403
Abstract

Single nucleotide mutation rates have critical implications for human evolution and genetic diseases. Importantly, the rates vary substantially across the genome and the principles underlying such variations remain poorly understood. A recent model explained much of this variation by considering higher-order nucleotide interactions in the 7-mer sequence context around mutated nucleotides. This model's success implicates a connection between DNA shape and mutation rates. DNA shape, i.e. structural properties like helical twist and tilt, is known to capture interactions between nucleotides within a local context. Thus, we hypothesized that changes in DNA shape features at and around mutated positions can explain mutation rate variations in the human genome. Indeed, DNA shape-based models of mutation rates showed similar or improved performance over current nucleotide sequence-based models. These models accurately characterized mutation hotspots in the human genome and revealed the shape features whose interactions underlie mutation rate variations. DNA shape also impacts mutation rates within putative functional regions like transcription factor binding sites where we find a strong association between DNA shape and position-specific mutation rates. This work demonstrates the structural underpinnings of nucleotide mutations in the human genome and lays the groundwork for future models of genetic variations to incorporate DNA shape.
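The idea of scoring a mutation by the change in DNA shape across its 7-mer context can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `context_kmers` and `shape_delta` are hypothetical, and `toy_shape` is a placeholder that only flags G/C bases so the example runs on its own. A real analysis would replace it with a per-position shape predictor (for example, helical twist values), which the abstract does not specify.

```python
from typing import Callable, List, Tuple


def context_kmers(seq: str, pos: int, alt: str, k: int = 7) -> Tuple[str, str]:
    """Return (reference k-mer, mutated k-mer) centred on 0-based index `pos`."""
    if k % 2 == 0:
        raise ValueError("k must be odd so the mutated base sits in the centre")
    flank = k // 2
    if pos - flank < 0 or pos + flank >= len(seq):
        raise ValueError("position too close to a sequence end for a full k-mer")
    ref = seq[pos - flank : pos + flank + 1].upper()
    alt = alt.upper()
    if alt == ref[flank]:
        raise ValueError("alternate base equals the reference base")
    # Substitute only the central base; the flanks are unchanged.
    mut = ref[:flank] + alt + ref[flank + 1 :]
    return ref, mut


def shape_delta(
    ref: str, mut: str, shape_fn: Callable[[str], List[float]]
) -> List[float]:
    """Per-position difference (mutated minus reference) of one shape feature."""
    ref_vals = shape_fn(ref)
    mut_vals = shape_fn(mut)
    return [m - r for r, m in zip(ref_vals, mut_vals)]


def toy_shape(kmer: str) -> List[float]:
    """ASSUMPTION: stand-in for a real shape predictor; 1.0 for G/C, else 0.0."""
    return [1.0 if base in "GC" else 0.0 for base in kmer]


if __name__ == "__main__":
    ref, mut = context_kmers("AACGTACGTT", pos=4, alt="C")
    print(ref, mut)
    print(shape_delta(ref, mut, toy_shape))
```

With a real shape predictor, the resulting difference vector (one per shape feature) could serve as the input features of a mutation rate model in place of the raw 7-mer sequence.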

Citing Articles

Ensemble learning-based predictor for driver synonymous mutation with sequence representation.

Bi C, Shi Y, Xia J, Liang Z, Wu Z, Xu K. PLoS Comput Biol. 2025; 21(1):e1012744.

PMID: 39761306 PMC: 11737855. DOI: 10.1371/journal.pcbi.1012744.


Towards the genomic sequence code of DNA fragility for machine learning.

Pflughaupt P, Abdullah A, Masuda K, Sahakyan A. Nucleic Acids Res. 2024; 52(21):12798-12816.

PMID: 39441076 PMC: 11602142. DOI: 10.1093/nar/gkae914.


kmerDB: A database encompassing the set of genomic and proteomic sequence information for each species.

Mouratidis I, Baltoumas F, Chantzi N, Patsakis M, Chan C, Montgomery A. Comput Struct Biotechnol J. 2024; 23:1919-1928.

PMID: 38711760 PMC: 11070822. DOI: 10.1016/j.csbj.2024.04.050.


C and G are frequently mutated into T and A in coding regions of human genes.

Wang Y, Chen K. Mol Genet Genomics. 2024; 299(1):23.

PMID: 38431687 DOI: 10.1007/s00438-024-02118-5.


Predicting DNA structure using a deep learning method.

Li J, Chiu T, Rohs R. Nat Commun. 2024; 15(1):1243.

PMID: 38336958 PMC: 10858265. DOI: 10.1038/s41467-024-45191-5.

