RNA-Seq Data Analysis Pipeline for Plants: Transcriptome Assembly, Alignment, and Differential Expression Analysis
In this chapter, we describe methods for analyzing RNA-Seq data, presented as a pipeline that begins with raw data from a sequencer and ends with a set of differentially expressed genes and their functional characterization. The first section covers de novo transcriptome assembly, for organisms that lack a reference genome or for researchers who want to work against an organism-specific transcriptome assembled from RNA-Seq data. The second section covers gene- and transcript-level quantification, leading to the third and final section on differential expression analysis between two or more conditions. The pipeline starts with raw sequence reads, followed by quality assessment and preprocessing of the input data to ensure robust estimates of the transcripts and their differential regulation. The preprocessed data can either be passed to the de novo transcriptome flow, where transcripts are assembled and functionally annotated using tools such as InterProScan or Blast2GO before being forwarded to the differential expression analysis flow, or, if a reference genome is available, passed directly to the differential expression analysis flow. An online repository containing sample data is also available, along with custom Python scripts that modify the output of the programs in the pipeline for various downstream analyses.
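The branching described above can be sketched as a small planning function. This is only an illustrative outline, not the chapter's own scripts: the function name `plan_pipeline`, the stage labels, and the choice to show functional annotation only in the de novo branch are assumptions made for the example.

```python
from typing import List


def plan_pipeline(has_reference_genome: bool) -> List[str]:
    """Return an ordered list of stage labels for one analysis run.

    The stage labels are hypothetical names chosen for this sketch;
    they do not correspond to real program names or commands.
    """
    # Every run starts with raw reads, quality assessment, and preprocessing.
    stages = ["raw_reads", "quality_assessment", "preprocessing"]

    # Without a reference genome, transcripts are assembled de novo and
    # functionally annotated before differential expression analysis.
    if not has_reference_genome:
        stages += ["de_novo_assembly", "functional_annotation"]

    # Both routes end with quantification and differential expression.
    stages += ["quantification", "differential_expression"]
    return stages


if __name__ == "__main__":
    for has_reference in (True, False):
        label = "with reference" if has_reference else "no reference"
        print(f"{label}: " + " -> ".join(plan_pipeline(has_reference)))
```

Returning the stages as a plain list keeps the two routes easy to compare; a real workflow would attach the actual tool invocations to each stage.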