
Incorporating Genetic Similarity of Auxiliary Samples into eGene Identification Under the Transfer Learning Framework

Overview
Journal J Transl Med
Publisher Biomed Central
Date 2024 Mar 9
PMID 38461317
Abstract

Background: The term eGene refers to a gene whose expression level is affected by at least one independent expression quantitative trait locus (eQTL). Identifying eQTLs and eGenes is both theoretically and empirically important in genomic studies. However, standard eGene detection methods generally focus on individual cis-variants and cannot efficiently transfer useful knowledge acquired from auxiliary samples to target studies.

Methods: We propose a multilocus-based eGene identification method called TLegene, which integrates shared genetic similarity information available from auxiliary studies under the statistical framework of transfer learning. We apply TLegene to eGene identification in ten TCGA cancers that have an explicitly relevant tissue in the GTEx project, learning the genetic effects of variants in TCGA from GTEx. We also apply TLegene to the Geuvadis project to evaluate its usefulness in non-cancer studies.

Results: We observed substantial genetic effect correlation of cis-variants between TCGA and GTEx for a large number of genes. Furthermore, consistent with the results of our simulations, we found that TLegene was more powerful than existing methods and identified 169 distinct candidate eGenes, many more than the approach that did not consider knowledge transfer between the target and auxiliary studies. Previous studies and functional enrichment analyses provided empirical evidence supporting the associations of the discovered eGenes, and the analyses also showed evidence of allelic heterogeneity of gene expression. In addition, TLegene identified more eGenes in Geuvadis and revealed that these eGenes were mainly enriched in EBV-transformed lymphocytes.

Conclusion: Overall, TLegene represents a flexible and powerful statistical method for eGene identification through transfer learning of genetic similarity shared across auxiliary and target studies.

Citing Articles

Polygenic prediction for underrepresented populations through transfer learning by utilizing genetic similarity shared with European populations.

Zhu Y, Chen W, Zhu K, Liu Y, Huang S, Zeng P Brief Bioinform. 2025; 26(1).

PMID: 39905953 PMC: 11794457. DOI: 10.1093/bib/bbaf048.


Transfer Learning in Cancer Genetics, Mutation Detection, Gene Expression Analysis, and Syndrome Recognition.

Ashayeri H, Sobhi N, Plawiak P, Pedrammehr S, Alizadehsani R, Jafarizadeh A Cancers (Basel). 2024; 16(11).

PMID: 38893257 PMC: 11171544. DOI: 10.3390/cancers16112138.
