BASAL: a Universal Mapping Algorithm for Nucleotide Base-conversion Sequencing
Overview
Base-conversion (BC) techniques have significantly advanced single-base-resolution profiling of RNA and DNA modifications. BC strategies range from one-way conversions (e.g., cytosine-to-thymine for 5-methylcytosine, adenine-to-guanine for N6-methyladenosine) to multi-way conversions (e.g., adenine to cytosine/guanine/thymine for N1-methyladenosine) and deletion-induced conversions (e.g., pseudouridine-to-deletion). Existing sequence aligners struggle with these diverse conversions, often producing misalignments or running inefficiently. We introduce BASAL (BAse-conversion Sequencing ALigner), which uses bit-masking to calculate mismatch penalties accurately and supports all BC strategies. BASAL outperforms state-of-the-art tools in both mapping accuracy and efficiency. Using simulated and real data, together with experimental validation, we demonstrate that BASAL excels at identifying reliable modification sites. Moreover, BASAL enhances single-cell m6A analysis, revealing cell subpopulations and a cell evolutionary direction that align with biological functions, which other aligners fail to resolve. BASAL's versatility establishes it as a universal aligner for RNA and DNA modification sequencing, facilitating groundbreaking discoveries in epigenomics and epitranscriptomics.
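To make the bit-masking idea concrete, the following is a minimal Python sketch of conversion-aware mismatch counting. It is an illustration under stated assumptions, not BASAL's actual implementation: the one-hot base encoding and the names `BASE_BITS`, `build_ref_masks`, and `count_mismatches` are hypothetical, and deletion-induced conversions are not handled because they need gap-aware alignment rather than a per-position mask.

```python
# One-hot 4-bit encoding of nucleotides.
# Assumption: an illustrative layout, not BASAL's documented one.
BASE_BITS = {"A": 0b0001, "C": 0b0010, "G": 0b0100, "T": 0b1000}


def build_ref_masks(conversions):
    """Map each reference base to the bit mask of read bases it may appear as.

    `conversions` maps a reference base to the read bases it can be
    converted to, e.g. {"C": "T"} for a one-way C-to-T strategy.
    """
    masks = dict(BASE_BITS)  # every base always matches itself
    for ref_base, read_bases in conversions.items():
        for read_base in read_bases:
            masks[ref_base] |= BASE_BITS[read_base]
    return masks


def count_mismatches(read, ref, masks):
    """Count positions whose read base is not permitted by the reference mask."""
    if len(read) != len(ref):
        raise ValueError("read and reference window must have equal length")
    return sum(
        1 for r, g in zip(read, ref) if (BASE_BITS[r] & masks[g]) == 0
    )


no_conversion = build_ref_masks({})
c_to_t = build_ref_masks({"C": "T"})       # one-way conversion
multi_way = build_ref_masks({"A": "CGT"})  # multi-way conversion

# A converted read is penalized without masks but not with the C-to-T mask.
print(count_mismatches("ATTG", "ACCG", no_conversion))
print(count_mismatches("ATTG", "ACCG", c_to_t))
# The mask is directional: a read C over a reference T is still a mismatch.
print(count_mismatches("ACCG", "ATTG", c_to_t))
```

Because the mask is attached to the reference base, the check is asymmetric: a read T over a reference C is tolerated under C-to-T conversion, while a read C over a reference T still counts as a true mismatch. Adding a new BC strategy in this sketch only requires a different `conversions` table.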