TransTEx: Novel Tissue-specificity Scoring Method for Grouping Human Transcriptome into Different Expression Groups

Overview

Journal: Bioinformatics

Publisher: Oxford University Press

Specialty: Biology

Date: 2024 Aug 9

PMID: 39120880

Authors

Pallavi Surana

Pratik Dutta

Ramana V Davuluri

Abstract

Motivation: Although human tissues carry out common molecular processes, gene expression patterns can distinguish different tissues. Traditional informatics methods, primarily at the gene level, overlook the complexity of alternative transcript variants and protein isoforms produced by most genes, changes in which are linked to disease prognosis and drug resistance.

Results: We developed TransTEx (Transcript-level Tissue Expression), a novel tissue-specificity scoring method, for grouping transcripts into four expression groups. TransTEx applies sequential cut-offs to tissue-wise transcript probability estimates, subsampling-based P-values and fold-change estimates. Applying TransTEx to GTEx mRNA-seq data divided 199 166 human transcripts into 17 999 tissue-specific (TSp), 7436 tissue-enhanced, 36 783 widely expressed (Wide), 79 191 lowly expressed (Low), and 57 757 no expression (Null) transcripts. Testis has the most TSp isoforms (13 466), followed by liver (890), brain (701), pituitary (435), and muscle (420). We found that the tissue specificity of alternative transcripts of a gene is predominantly influenced by alternate promoter usage. By overlapping brain-specific transcripts with the cell-type gene markers in the scBrainMap database, we found that 63% of the brain-specific transcripts were enriched in nonneuronal cell types, predominantly astrocytes, followed by endothelial cells and oligodendrocytes. In addition, we found 61 brain cell-type marker genes encoding a total of 176 alternative transcripts that are brain-specific and 22 alternative transcripts that are testis-specific, highlighting the complexity of TSp and cell-type-specific gene regulation and expression at the isoform level. TransTEx can be adapted to the analysis of bulk RNA-seq or scRNA-seq datasets to find tissue- and/or cell-type-specific isoform-level gene markers.
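To make the idea of sequential cut-offs concrete, the following Python sketch classifies one transcript from three per-tissue inputs: a probability estimate, a subsampling-based P-value, and a fold-change estimate. It is an illustration under stated assumptions, not the TransTEx implementation: the abstract does not give the thresholds, the order of the tests, or the exact decision rules, so the `Cutoffs` values, the `classify` function, and the fallback to "Low" are all hypothetical. Only the group labels and the reported transcript counts come from the abstract.

```python
from dataclasses import dataclass

# Transcript counts per group as reported in the abstract (GTEx mRNA-seq data).
REPORTED_COUNTS = {
    "TSp": 17_999,
    "Enhanced": 7_436,
    "Wide": 36_783,
    "Low": 79_191,
    "Null": 57_757,
}
# The five reported groups account for all 199 166 transcripts.
assert sum(REPORTED_COUNTS.values()) == 199_166


@dataclass(frozen=True)
class Cutoffs:
    """Hypothetical thresholds; none of these values come from the paper."""
    null_prob: float = 0.1        # below this in every tissue -> "Null"
    min_prob: float = 0.5         # at or above this -> tissue counts as expressing
    max_pvalue: float = 0.05      # subsampling-based P-value cut-off
    min_fold_change: float = 4.0  # fold change versus the other tissues
    wide_fraction: float = 0.8    # share of expressing tissues needed for "Wide"


def classify(prob, pvalue, fold_change, cutoffs=Cutoffs()):
    """Assign one transcript to a group; each argument maps tissue -> float."""
    tissues = list(prob)

    # Cut-off 1: no tissue shows appreciable expression probability.
    if max(prob.values()) < cutoffs.null_prob:
        return "Null"

    # Cut-off 2: some signal exists, but no tissue expresses it confidently.
    expressing = [t for t in tissues if prob[t] >= cutoffs.min_prob]
    if not expressing:
        return "Low"

    # Cut-off 3: keep tissues that are both significant and strongly enriched.
    enriched = [
        t for t in expressing
        if pvalue[t] <= cutoffs.max_pvalue
        and fold_change[t] >= cutoffs.min_fold_change
    ]
    if len(enriched) == 1 and len(expressing) == 1:
        return "TSp"
    if enriched:
        return "Enhanced"

    # Cut-off 4: expressed in most tissues without enrichment anywhere.
    if len(expressing) >= cutoffs.wide_fraction * len(tissues):
        return "Wide"
    return "Low"  # assumed fallback for patchy, non-enriched expression


if __name__ == "__main__":
    label = classify(
        prob={"testis": 0.9, "liver": 0.02, "brain": 0.01},
        pvalue={"testis": 0.001, "liver": 0.9, "brain": 0.9},
        fold_change={"testis": 20.0, "liver": 0.1, "brain": 0.1},
    )
    print(label)
```

Ordering the tests from cheapest to most specific means a transcript is only checked for enrichment once it has cleared the expression filters, which is one plausible reading of "sequential cut-offs"; the published method may order or combine the criteria differently.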

Availability and implementation: The TransTEx database is available at https://bmi.cewit.stonybrook.edu/transtexdb/ and the R package is available via GitHub: https://github.com/pallavisurana1/TransTEx.
