
BacTermFinder: a Comprehensive and General Bacterial Terminator Finder Using a CNN Ensemble

Overview
Specialty Biology
Date 2025 Mar 10
PMID 40060369
Abstract

A terminator is a DNA region that ends the transcription process. Multiple computational tools are currently available for predicting bacterial terminators. However, these methods are specialized for certain bacteria or for one terminator type (i.e. intrinsic or factor-dependent). In this work, we developed BacTermFinder, an ensemble of convolutional neural networks (CNNs) that receives as input four different representations of terminator sequences. To develop BacTermFinder, we collected roughly 41 000 bacterial terminators (intrinsic and factor-dependent) from 22 species with varying GC-content (from 28% to 71%), taken from published studies that used RNA-seq technologies. We evaluated BacTermFinder's performance on terminators of five bacterial species (not used for training BacTermFinder) and two archaeal species, and compared it with that of four other bacterial terminator prediction tools. Based on our results, BacTermFinder outperforms the four other approaches in terms of average recall without increasing the number of false positives. Moreover, BacTermFinder identifies both types of terminators (intrinsic and factor-dependent) and generalizes to archaeal terminators. Additionally, we visualized the saliency maps of the CNNs to gain insights into terminator motifs per species. BacTermFinder is publicly available at https://github.com/BioinformaticsLabAtMUN/BacTermFinder.
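
The abstract does not name the four sequence representations or the rule used to combine the CNN outputs, so the following is only a minimal sketch of the general idea, not BacTermFinder's code. It assumes one-hot encoding as an example representation and simple averaging of per-model probabilities as the ensemble rule; the function names `gc_content`, `one_hot`, and `ensemble_average` are hypothetical.

```python
# Illustrative sketch only; not taken from the BacTermFinder repository.
from statistics import mean

ALPHABET = "ACGT"  # column order used by one_hot below


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def one_hot(seq: str) -> list[list[int]]:
    """Encode a sequence as one row per base (columns A, C, G, T).

    Bases outside the alphabet (for example N) become all-zero rows.
    One-hot encoding is an assumed example of a sequence representation.
    """
    return [[int(base == letter) for letter in ALPHABET] for base in seq.upper()]


def ensemble_average(probabilities: list[float]) -> float:
    """Combine per-model terminator probabilities by averaging (assumed rule)."""
    return mean(probabilities)
```

For example, `gc_content("GGCA")` gives 0.75, and a genome-wide version of the same calculation is what the 28% to 71% GC-content range in the abstract refers to.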
