PMID: 32397130

Transcriptomics in Toxicogenomics, Part II: Preprocessing and Differential Expression Analysis for High Quality Data

Abstract

Preprocessing of transcriptomics data plays a pivotal role in the development of toxicogenomics-driven tools for chemical toxicity assessment. Generating and exploiting large volumes of molecular profiles under an appropriate experimental design allows toxicogenomics (TGx) approaches to thoroughly characterise the mechanism of action (MOA) of different compounds. To date, a plethora of data preprocessing methodologies have been proposed, but in most cases building the optimal analytical workflow is not straightforward. The tools must be selected carefully, since the choice affects the downstream analyses and modelling approaches. Transcriptomics data preprocessing spans multiple steps, such as quality checking, filtering, normalization, and batch effect detection and correction. Currently, the TGx field lacks standard guidelines for data preprocessing. Defining the optimal tools and procedures for transcriptomics data preprocessing will lead to homogeneous and unbiased data, allowing the development of more reliable, robust and accurate predictive models. In this review, we outline methods for preprocessing data from three main transcriptomic technologies: microarray, bulk RNA-Sequencing (RNA-Seq), and single cell RNA-Sequencing (scRNA-Seq). Moreover, we discuss the most common methods for identifying differentially expressed genes and for performing functional enrichment analysis. This review is the second part of a three-article series on Transcriptomics in Toxicogenomics.

Citing Articles

Advancing chemical safety assessment through an omics-based characterization of the test system-chemical interaction.

Del Giudice G, Migliaccio G, D'Alessandro N, Saarimaki L, Maia M, Annala M Front Toxicol. 2023; 5:1294780.

PMID: 38026842 PMC: 10673692. DOI: 10.3389/ftox.2023.1294780.


A curated gene and biological system annotation of adverse outcome pathways related to human health.

Saarimaki L, Fratello M, Pavel A, Korpilahde S, Leppanen J, Serra A Sci Data. 2023; 10(1):409.

PMID: 37355733 PMC: 10290716. DOI: 10.1038/s41597-023-02321-w.


Transcriptomic Analysis of Diethylstilbestrol in Daphnia Magna: Energy Metabolism and Growth Inhibition.

Li Q, Zhao Q, Guo J, Li X, Song J Toxics. 2023; 11(2).

PMID: 36851071 PMC: 9962875. DOI: 10.3390/toxics11020197.


Data-driven analysis and druggability assessment methods to accelerate the identification of novel cancer targets.

Beis G, Serafeim A, Papasotiriou I Comput Struct Biotechnol J. 2022; 21:46-57.

PMID: 36514341 PMC: 9732000. DOI: 10.1016/j.csbj.2022.11.042.


The potential of a data centred approach & knowledge graph data representation in chemical safety and drug design.

Pavel A, Saarimaki L, Mobus L, Federico A, Serra A, Greco D Comput Struct Biotechnol J. 2022; 20:4837-4849.

PMID: 36147662 PMC: 9464643. DOI: 10.1016/j.csbj.2022.08.061.

