IsoTools 2.0: Software for Comprehensive Analysis of Long-read Transcriptome Sequencing Data
Overview
Molecular Biology
Direct, single-molecule measurement of RNA by long-read transcriptome sequencing (LRTS) enables the reliable detection of transcripts and alternative splicing events, thereby contributing to the identification of splicing mechanisms, the improvement of current gene models, and the prediction of more reliable protein isoforms. LRTS data comes from either PacBio's single-molecule real-time sequencing or Oxford Nanopore's nanopore sequencing. Previously, we developed IsoTools, a software package originally designed for processing and analyzing PacBio data. IsoTools copes with the complexity of LRTS data and offers multiple functionalities for transcript identification and quantification, as well as for the analysis of differential isoform usage and local differential splicing events. Here, we report an update of the software, IsoTools 2.0, and demonstrate its performance on Oxford Nanopore data from multiple experimental protocols. We present the IsoTools 2.0 workflow, highlighting novel functionalities for reliable transcript detection and transcription start site prediction. Additionally, we introduce novel metrics for the structural description and quantification of gene model variability based on a gene's transcripts. We demonstrate the performance of IsoTools 2.0 on a variety of experimental protocols for library construction from a recent LRTS challenge. We show that IsoTools 2.0 is able to cope with the inherent complexity of LRTS data and that the workflow generates meaningful hypotheses on biomarkers for alternative splicing. The software is available from https://github.com/HerwigLab/IsoTools2/.