Incomplete Time-series Gene Expression in Integrative Study for Islet Autoimmunity Prediction

Overview
Journal Brief Bioinform
Specialty Biology
Date 2022 Dec 13
PMID 36513375
Abstract

Type 1 diabetes (T1D) outcome prediction plays a vital role in identifying novel risk factors, ensuring early patient care and designing cohort studies. TEDDY is a longitudinal cohort study that collects a vast amount of multi-omics and clinical data from its participants to explore the progression and markers of T1D. However, missing data in the omics profiles make outcome prediction difficult. TEDDY collected time-series gene expression for fewer than 6% of enrolled participants. Additionally, for the participants whose gene expression was collected, 79% of time steps are missing. This study introduces a bioinformatics framework for gene expression imputation and islet autoimmunity (IA) prediction. The imputation model generates synthetic data for participants with partially or entirely missing gene expression. The prediction model integrates the synthetic gene expression with other risk factors to achieve better predictive performance. Comprehensive experiments on TEDDY datasets show that: (1) the pipeline can effectively integrate synthetic gene expression with family history, HLA genotype and SNPs to predict IA status at 2 years (sensitivity 0.622, AUC 0.715) better than the individual datasets alone and better than state-of-the-art results in the literature (AUC 0.682); (2) the synthetic gene expression contains predictive signals as strong as those in the true gene expression, reducing reliance on expensive, long-term longitudinal data collection; (3) time-series gene expression is crucial to the reported improvement and shows significantly better predictive ability than cross-sectional gene expression; and (4) the pipeline is robust to limited data availability. Availability: Code is available at https://github.com/compbiolabucf/TEDDY.
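The abstract reports performance as sensitivity and AUC. As a reminder of what those two numbers measure, here is a minimal Python sketch that computes both from binary labels and predicted scores. It is not taken from the authors' repository: the function names, the 0.5 decision threshold and the toy labels and scores are all assumptions made for illustration, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation.

```python
def sensitivity(y_true, y_pred):
    """Fraction of actual positives that were predicted positive: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data invented for this example (1 = IA positive, 0 = IA negative).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1, 0.8]

# Hypothetical decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("sensitivity:", sensitivity(y_true, y_pred))
print("AUC:", auc(y_true, scores))
```

Sensitivity depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.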

Citing Articles

Optimizing multi-omics data imputation with NMF and GAN synergy.

Ansari M, Ahmed K, Zhang W. Bioinformatics. 2024; 40(11).

PMID: 39546381 PMC: 11639186. DOI: 10.1093/bioinformatics/btae674.


A Concerted Vision to Advance the Knowledge of Diabetes Mellitus Related to Immune Checkpoint Inhibitors.

Deligiorgi M, Trafalis D. Int J Mol Sci. 2023; 24(8).

PMID: 37108792 PMC: 10146255. DOI: 10.3390/ijms24087630.
