Gene Expression Prediction by Soft Integration and the Elastic Net: Best Performance of the DREAM3 Gene Expression Challenge

Overview
Journal PLoS One
Date 2010 Feb 20
PMID 20169069
Citations 19
Abstract

Background: Predicting gene expression is an important endeavour in computational systems biology. It offers a way to explore how drugs affect the system, and it provides a framework for identifying which genes are interrelated in a given process. A practical problem, however, is how to assess and discriminate among the various algorithms that have been developed for this purpose. To address this, the DREAM project issued a challenge in 2008 to predict gene expression values, and here we present the best-performing algorithm.

Methodology/Principal Findings: We developed an algorithm by exploring various regression schemes with different model selection procedures. The most effective scheme is based on least squares with a recently developed penalty term called the "elastic net". Key components of the algorithm are the integration of expression data from experimental conditions other than those presented for the challenge, and the use of transcription factor binding data to guide the inference process towards known interactions. Also important is a cross-validation procedure in which each form of external data is used only to the extent that it increases the expected performance.
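
As a rough illustration of the regression scheme named above, the sketch below fits a least-squares model with an elastic net penalty by coordinate descent and picks the penalty strength by k-fold cross-validation. It is not the authors' implementation: the objective scaling, the function names (`elastic_net`, `cv_error`, `select_lambda`, `make_toy_data`), the synthetic data, and the absence of an intercept are all assumptions made for the example, and the integration of external expression data and transcription factor binding data is not modelled.

```python
import random


def soft_threshold(z, g):
    """Shrink z towards zero by g (the proximal step for the L1 penalty)."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def elastic_net(X, y, lam, alpha=0.5, n_iter=200):
    """Coordinate descent for the (assumed) objective
    (1/2n)*||y - X b||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||_2^2).
    No intercept is fitted, so the data are assumed to be centred."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # current residual y - X beta (beta starts at zero)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (beta_j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def cv_error(X, y, lam, alpha=0.5, k=5):
    """Mean squared prediction error over k interleaved folds."""
    n = len(X)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        beta = elastic_net([X[i] for i in train], [y[i] for i in train], lam, alpha)
        for i in test:
            pred = sum(b * x for b, x in zip(beta, X[i]))
            total += (y[i] - pred) ** 2
    return total / n


def select_lambda(X, y, lambdas, alpha=0.5, k=5):
    """Return the candidate penalty strength with the lowest CV error."""
    return min(lambdas, key=lambda lam: cv_error(X, y, lam, alpha, k))


def make_toy_data(n=40, p=6, seed=0):
    """Hypothetical synthetic data: only predictors 0 and 2 affect the target."""
    rng = random.Random(seed)
    true_beta = [2.0, 0.0, -1.0] + [0.0] * (p - 3)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [sum(b * x for b, x in zip(true_beta, row)) + rng.gauss(0.0, 0.1) for row in X]
    return X, y, true_beta


if __name__ == "__main__":
    X, y, _ = make_toy_data()
    best = select_lambda(X, y, [0.01, 0.1, 1.0])
    print("selected lambda:", best)
    print("coefficients:", [round(b, 3) for b in elastic_net(X, y, best)])
```

In this sketch, cross-validation only selects the penalty strength. The same held-out-error comparison could in principle decide whether an extra data source is included at all, which is the spirit of using external data "only to the extent that it increases the expected performance".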

Conclusions/Significance: Our algorithm demonstrates both that information useful for predicting gene levels can be extracted from large-scale expression data and that integrating different data sources improves the inference. We believe the former is an important message for those still doubtful about the potential of computational approaches, while the latter is an important way forward for the field of computational systems biology.

Citing Articles

Network reconstruction for trans acting genetic loci using multi-omics data and prior information.

Hawe J, Saha A, Waldenberger M, Kunze S, Wahl S, Muller-Nurasyid M Genome Med. 2022; 14(1):125.

PMID: 36344995 PMC: 9641770. DOI: 10.1186/s13073-022-01124-9.


Inference of phenotype-relevant transcriptional regulatory networks elucidates cancer type-specific regulatory mechanisms in a pan-cancer study.

Emad A, Sinha S NPJ Syst Biol Appl. 2021; 7(1):9.

PMID: 33558504 PMC: 7870953. DOI: 10.1038/s41540-021-00169-7.


Timepoint Selection Strategy for In Vivo Proteome Dynamics from Heavy Water Metabolic Labeling and LC-MS.

Sadygov V, Zhang W, Sadygov R J Proteome Res. 2020; 19(5):2105-2112.

PMID: 32183509 PMC: 8864836. DOI: 10.1021/acs.jproteome.0c00023.


Integration of Multiple Data Sources for Gene Network Inference Using Genetic Perturbation Data.

Liang X, Chad Young W, Hung L, Raftery A, Yeung K J Comput Biol. 2019; 26(10):1113-1129.

PMID: 31009236 PMC: 6786343. DOI: 10.1089/cmb.2019.0036.


Global transcriptional regulatory network for Escherichia coli robustly connects gene expression to transcription factor activities.

Fang X, Sastry A, Mih N, Kim D, Tan J, Yurkovich J Proc Natl Acad Sci U S A. 2017; 114(38):10286-10291.

PMID: 28874552 PMC: 5617254. DOI: 10.1073/pnas.1702581114.

