
Evaluating the Informativeness of Deep Learning Annotations for Human Complex Diseases

Overview
Journal Nat Commun
Specialty Biology
Date 2020 Sep 18
PMID 32943643
Citations 19
Abstract

Deep learning models have shown great promise in predicting regulatory effects from DNA sequence, but their informativeness for human complex diseases is not fully understood. Here, we evaluate genome-wide SNP annotations from two previous deep learning models, DeepSEA and Basenji, by applying stratified LD score regression to 41 diseases and traits (average N = 320K), conditioning on a broad set of coding, conserved and regulatory annotations. We aggregated annotations across all (respectively blood or brain) tissues/cell-types in meta-analyses across all (respectively 11 blood or 8 brain) traits. The annotations were highly enriched for disease heritability, but produced only limited conditionally significant results: non-tissue-specific and brain-specific Basenji-H3K4me3 for all traits and brain traits respectively. We conclude that deep learning models have yet to achieve their full potential to provide considerable unique information for complex disease, and that their conditional informativeness for disease cannot be inferred from their accuracy in predicting regulatory annotations.
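The "enriched for disease heritability" claim can be made concrete with a small sketch. It assumes the usual definition of enrichment for an annotation with values in [0, 1]: the share of heritability attributed to the annotation divided by the share of SNPs in the annotation. The function name and the toy per-SNP heritability values are hypothetical and serve only as illustration. In a real stratified LD score regression analysis these quantities are estimated from GWAS summary statistics and LD scores rather than observed directly.

```python
import numpy as np


def heritability_enrichment(annotation, per_snp_h2):
    """Return (share of heritability in annotation) / (share of SNPs in annotation).

    annotation  : per-SNP annotation values in [0, 1] (binary or probabilistic)
    per_snp_h2  : per-SNP heritability values (hypothetical toy inputs here)
    """
    a = np.asarray(annotation, dtype=float)
    h2 = np.asarray(per_snp_h2, dtype=float)
    if a.shape != h2.shape:
        raise ValueError("annotation and per_snp_h2 must have the same shape")
    prop_h2 = np.sum(a * h2) / np.sum(h2)   # share of heritability in the annotation
    prop_snps = np.mean(a)                  # share of SNPs in the annotation
    return prop_h2 / prop_snps


# Toy data (assumed, not from the paper): 10 SNPs, 2 of them annotated,
# and each annotated SNP carries 4x the heritability of an unannotated SNP.
annotation = np.array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0])
per_snp_h2 = np.array([4.0, 4.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])

enrichment = heritability_enrichment(annotation, per_snp_h2)
print(f"enrichment = {enrichment:.2f}")
```

Enrichment alone does not establish the conditional signal the abstract refers to. An annotation can be strongly enriched simply because it overlaps coding, conserved or regulatory annotations already in the model, which is why the study also tests each annotation's effect conditional on that broader set.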

Citing Articles

Benchmarking DNA Sequence Models for Causal Regulatory Variant Prediction in Human Genetics.

Benegas G, Eraslan G, Song Y. bioRxiv. 2025.

PMID: 39990426 PMC: 11844472. DOI: 10.1101/2025.02.11.637758.


Iterative improvement of deep learning models using synthetic regulatory genomics.

Ribeiro-Dos-Santos A, Maurano M. bioRxiv. 2025.

PMID: 39974895 PMC: 11838587. DOI: 10.1101/2025.02.04.636130.


Deciphering the tissue-specific functional effect of Alzheimer risk SNPs with deep genome annotation.

Pugalenthi P, He B, Xie L, Nho K, Saykin A, Yan J. BioData Min. 2024; 17(1):50.

PMID: 39538253 PMC: 11558841. DOI: 10.1186/s13040-024-00400-1.


Current genomic deep learning models display decreased performance in cell type-specific accessible regions.

Kathail P, Shuai R, Chung R, Ye C, Loeb G, Ioannidis N. Genome Biol. 2024; 25(1):202.

PMID: 39090688 PMC: 11293111. DOI: 10.1186/s13059-024-03335-2.


Cross-Species Prediction of Transcription Factor Binding by Adversarial Training of a Novel Nucleotide-Level Deep Neural Network.

Zhang Q, Wang S, Li Z, Pan Y, Huang D. Adv Sci (Weinh). 2024; 11(36):e2405685.

PMID: 39076052 PMC: 11423150. DOI: 10.1002/advs.202405685.

