
csDMA: an Improved Bioinformatics Tool for Identifying DNA 6mA Modifications Via Chou's 5-step Rule

Overview
Journal: Sci Rep
Specialty: Science
Date: 2019 Sep 13
PMID: 31511570
Citations: 10
Abstract

DNA N6-methyldeoxyadenosine (6mA) modifications were first found more than 60 years ago but were long thought to be widespread only in prokaryotes and unicellular eukaryotes. With the development of high-throughput sequencing technology, 6mA modifications have been detected experimentally in a range of multicellular eukaryotes. However, experimental methods are time-consuming and costly, which makes computational alternatives necessary. In this study, a machine learning-based prediction tool named csDMA was developed for predicting 6mA modifications. First, three feature encoding schemes (Motif, Kmer, and Binary) were used to generate the feature matrix. Second, several algorithms were evaluated for the prediction model; the ExtraTrees model achieved the best AUC, 0.878, in 5-fold cross-validation on the training dataset. The ExtraTrees model also achieved the best AUC, 0.893, on the independent testing dataset. Finally, we compared our method with state-of-the-art predictors, and the results showed that our model performed better than existing tools.
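
To make the feature-encoding step concrete, here is a minimal Python sketch of two of the three schemes named in the abstract, Kmer and Binary. It is an illustration under assumptions, not the csDMA implementation: the function names, the choice of k = 2, the lexicographic k-mer ordering, and the handling of non-ACGT characters are all hypothetical, and the Motif scheme is omitted because the abstract does not describe it.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed nucleotide alphabet


def binary_encode(seq):
    """One-hot ("Binary") encoding: 4 bits per base, all zeros for non-ACGT."""
    vec = []
    for base in seq.upper():
        vec.extend(1 if base == letter else 0 for letter in ALPHABET)
    return vec


def kmer_encode(seq, k=2):
    """Relative frequency of every possible k-mer, in lexicographic order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(seq) - k + 1  # number of sliding windows
    if total <= 0:
        return [0.0] * len(kmers)
    for i in range(total):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases
            counts[window] += 1
    return [counts[km] / total for km in kmers]


def encode(seq, k=2):
    """Concatenate both encodings into one feature vector (hypothetical layout)."""
    return kmer_encode(seq, k) + binary_encode(seq)


if __name__ == "__main__":
    print(len(encode("ACGT")))  # 16 k-mer features + 16 binary features
```

Stacking such vectors for equal-length sequences gives a feature matrix. One plausible way to reproduce the evaluation described above would be to pass that matrix to scikit-learn's `ExtraTreesClassifier` and score it with `cross_val_score(..., cv=5, scoring="roc_auc")`, although the abstract does not state which library the authors used.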

Citing Articles

PSATF-6mA: an integrated learning fusion feature-encoded DNA-6 mA methylcytosine modification site recognition model based on attentional mechanisms.

Kang Y, Wang H, Qin Y, Liu G, Yu Y, Zhang Y Front Genet. 2024; 15:1498884.

PMID: 39600317 PMC: 11588721. DOI: 10.3389/fgene.2024.1498884.


Biological Sequence Classification: A Review on Data and General Methods.

Ao C, Jiao S, Wang Y, Yu L, Zou Q Research (Wash D C). 2024; 2022:0011.

PMID: 39285948 PMC: 11404319. DOI: 10.34133/research.0011.


HormoNet: a deep learning approach for hormone-drug interaction prediction.

Emami N, Ferdousi R BMC Bioinformatics. 2024; 25(1):87.

PMID: 38418979 PMC: 10903040. DOI: 10.1186/s12859-024-05708-7.


Harnessing Current Knowledge of DNA N6-Methyladenosine From Model Plants for Non-model Crops.

Chachar S, Liu J, Zhang P, Riaz A, Guan C, Liu S Front Genet. 2021; 12:668317.

PMID: 33995495 PMC: 8118384. DOI: 10.3389/fgene.2021.668317.


6mA-Pred: identifying DNA N6-methyladenine sites based on deep learning.

Huang Q, Zhou W, Guo F, Xu L, Zhang L PeerJ. 2021; 9:e10813.

PMID: 33604189 PMC: 7866889. DOI: 10.7717/peerj.10813.

