
Rockfish: A Transformer-based Model for Accurate 5-methylcytosine Prediction from Nanopore Sequencing

Overview
Journal: Nat Commun
Specialty: Biology
Date: 2024 Jul 3
PMID: 38961062
Abstract

DNA methylation plays an important role in various biological processes, including cell differentiation, ageing, and cancer development. The most important methylation in mammals is 5-methylcytosine, which mostly occurs in the context of CpG dinucleotides. Sequencing methods such as whole-genome bisulfite sequencing successfully detect 5-methylcytosine DNA modifications. However, they suffer from the serious drawback of short read lengths and might introduce an amplification bias. Here we present Rockfish, a deep learning algorithm that significantly improves read-level 5-methylcytosine detection using Nanopore sequencing. Rockfish is compared with other Nanopore-based methods on R9.4.1 and R10.4.1 datasets. Single-base accuracy and the F1 measure increase by up to 5 percentage points on R9.4.1 datasets and by up to 0.82 percentage points on R10.4.1 datasets. Moreover, Rockfish shows a high correlation with whole-genome bisulfite sequencing, requires lower read depth, and achieves higher confidence in biologically important regions such as CpG-rich promoters while being computationally efficient. Its superior performance in human and mouse samples highlights its versatility for studying 5-methylcytosine methylation across varied organisms and diseases. Finally, its adaptable architecture ensures compatibility with new versions of pores and chemistry as well as modification types.

Citing Articles

Transitioning from wet lab to artificial intelligence: a systematic review of AI predictors in CRISPR.

Abbasi A, Asim M, Dengel A. J Transl Med. 2025; 23(1):153.

PMID: 39905452 PMC: 11796103. DOI: 10.1186/s12967-024-06013-w.


Methods for Detection and Mapping of Methylated and Hydroxymethylated Cytosine in DNA.

Kisil O, Sergeev A, Bacheva A, Zvereva M. Biomolecules. 2024; 14(11).

PMID: 39595523 PMC: 11591845. DOI: 10.3390/biom14111346.


Rockfish: A transformer-based model for accurate 5-methylcytosine prediction from nanopore sequencing.

Stanojevic D, Li Z, Bakic S, Foo R, Sikic M. Nat Commun. 2024; 15(1):5580.

PMID: 38961062 PMC: 11222435. DOI: 10.1038/s41467-024-49847-0.


A signal processing and deep learning framework for methylation detection using Oxford Nanopore sequencing.

Ahsan M, Gouru A, Chan J, Zhou W, Wang K. Nat Commun. 2024; 15(1):1448.

PMID: 38365920 PMC: 10873387. DOI: 10.1038/s41467-024-45778-y.
