Integrative Regression Network for Genomic Association Study

Overview
Publisher: BioMed Central
Specialty: Genetics
Date: 2016 Aug 19
PMID: 27535739
Citations: 4
Abstract

Background: The increasing availability of multiple types of genomic profiles measured from the same cancer patients has provided numerous opportunities for investigating genomic mechanisms underlying cancer. In particular, association studies of gene expression traits with respect to multi-layered genomic features are highly useful for uncovering the underlying mechanism. Conventional correlation-based association tests are limited because they are prone to revealing indirect associations. Moreover, integration of multiple types of genomic features raises another challenge.

Methods: In this study, we propose a new framework for association studies, the integrative regression network, which identifies genomic associations across multiple high-dimensional genomic profiles by taking into account associations between as well as within profiles. We first employed high-dimensional regression techniques to identify the associations between different genomic profiles. Based on the resulting regression coefficients, a regression network was then constructed within each profile. For example, two methylation features that have similar regression coefficients with respect to a number of gene expression traits are likely to be involved in the same biological process, so we define an edge between them in the regression network. To extract more reliable associations, multiple sparse structured regression techniques were applied, and the resulting networks were merged into the integrative regression network using a similarity network fusion technique.
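
As a rough illustration of the first two steps described in the Methods (sparse regression between profiles, then a within-profile network built from coefficient similarity), the following is a minimal sketch and not the authors' implementation. It assumes a plain lasso fitted by proximal gradient descent as a stand-in for the sparse structured regression methods, cosine similarity as the measure of "similar regression coefficients", and an arbitrary edge threshold. The function names, the penalty `lam`, the threshold, and the synthetic data are all hypothetical.

```python
import numpy as np

def lasso_ista(X, y, lam=0.1, n_iter=500):
    """Lasso by proximal gradient (ISTA): min (1/2n)||y - Xb||^2 + lam*||b||_1.
    No intercept is fitted, so X and y are assumed to be centered."""
    n, p = X.shape
    # Lipschitz constant of the gradient of the squared-error term.
    lipschitz = np.linalg.norm(X, 2) ** 2 / n
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        z = b - grad / lipschitz
        # Soft-thresholding is the proximal step for the L1 penalty.
        b = np.sign(z) * np.maximum(np.abs(z) - lam / lipschitz, 0.0)
    return b

def coefficient_matrix(X, Y, lam=0.1):
    """Regress every trait (column of Y) on all features (columns of X).
    Returns B with shape (n_features, n_traits)."""
    return np.column_stack([lasso_ista(X, Y[:, j], lam) for j in range(Y.shape[1])])

def regression_network(B, threshold=0.5):
    """Link two features when their coefficient rows are similar (cosine >= threshold)."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    U = B / np.where(norms > 0, norms, 1.0)  # all-zero rows stay zero
    S = U @ U.T
    np.fill_diagonal(S, 0.0)                 # no self-loops
    A = (S >= threshold).astype(int)
    return S, A

# Synthetic example: 100 samples, 6 "methylation" features, 8 "expression" traits.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
w = np.array([1.5, -2.0, 1.0, 2.0, -1.5, 1.0, -1.0, 2.0])
B_true = np.zeros((6, 8))
B_true[0] = w    # features 0 and 1 share the same effect pattern
B_true[1] = w
B_true[2] = -w   # feature 2 has the opposite pattern
Y = X @ B_true + 0.1 * rng.normal(size=(100, 8))

B = coefficient_matrix(X, Y, lam=0.05)
S, A = regression_network(B, threshold=0.5)
print(A[:3, :3])
```

In this toy setting, features 0 and 1 receive an edge because their estimated coefficient rows point in nearly the same direction, while feature 2 does not. Whether negatively correlated coefficient patterns should also count as an association is a design choice that this sketch leaves open.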

Results: Experiments were carried out using four different sparse structured regression methods on five cancer types from TCGA. The advantages and disadvantages of each regression method were also explored. We found large inconsistencies among the results from different regression methods, which supports the need to extract the proposed integrative regression network from multiple complementary regression techniques. Fusing multiple regression networks by using similarity measurements led to the identification of significant gene pairs and a resulting network with better topological properties.
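
To make the fusion step more concrete, here is a simplified sketch of the cross-diffusion idea behind similarity network fusion, applied to several networks over the same set of nodes. It is an assumption-laden illustration, not the authors' code: the normalization is plainer than in the published similarity network fusion algorithm, and the function names, the neighbourhood size `k`, the iteration count, and the synthetic input networks are hypothetical.

```python
import numpy as np

def row_normalize(W):
    """Scale each row to sum to 1; all-zero rows are left unchanged."""
    s = W.sum(axis=1, keepdims=True)
    s = np.where(s > 0, s, 1.0)
    return W / s

def knn_kernel(W, k):
    """Keep each node's k largest similarities, then row-normalize (local affinity)."""
    S = np.zeros_like(W, dtype=float)
    idx = np.argsort(-W, axis=1)[:, :k]
    rows = np.arange(W.shape[0])[:, None]
    S[rows, idx] = W[rows, idx]
    return row_normalize(S)

def fuse_networks(networks, k=3, n_iter=20):
    """Iteratively diffuse each network through the average of the others.
    Requires at least two networks with identical node ordering."""
    m = len(networks)
    P = [row_normalize(np.asarray(W, dtype=float)) for W in networks]
    S = [knn_kernel(np.asarray(W, dtype=float), k) for W in networks]
    for _ in range(n_iter):
        P_next = []
        for v in range(m):
            others = sum(P[u] for u in range(m) if u != v) / (m - 1)
            P_next.append(row_normalize(S[v] @ others @ S[v].T))
        P = P_next
    fused = sum(P) / m
    return (fused + fused.T) / 2  # symmetrize the final network

# Synthetic example: three noisy views of the same two-cluster similarity structure.
rng = np.random.default_rng(1)
labels = np.array([0] * 4 + [1] * 4)
same = labels[:, None] == labels[None, :]
networks = []
for _ in range(3):
    noise = rng.uniform(0.0, 0.1, size=(8, 8))
    W = np.where(same, 0.9, 0.1) + (noise + noise.T) / 2
    np.fill_diagonal(W, 1.0)
    networks.append(W)

fused = fuse_networks(networks, k=3, n_iter=20)
off_diag = ~np.eye(8, dtype=bool)
within = fused[same & off_diag].mean()
between = fused[~same].mean()
print(within > between)
```

Diffusing each network through the others is what allows edges supported by several regression methods to be reinforced while method-specific edges fade, which is the intuition behind merging inconsistent regression networks.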

Conclusions: We developed and validated the integrative regression network scheme on multi-layered genomic profiles from TCGA. Our method facilitates the identification of strong as well as weaker signals by fusing information from different regression techniques. It could also be extended to integrate results obtained from different cancer types.

Citing Articles

A multivariable approach for risk markers from pooled molecular data with only partial overlap.

Stelzer A, Maccioni L, Gerhold-Ay A, Smedby K, Schumacher M, Nieters A. BMC Med Genet. 2019; 20(1):128.

PMID: 31324155 PMC: 6642584. DOI: 10.1186/s12881-019-0849-0.


Topological integration of RPPA proteomic data with multi-omics data for survival prediction in breast cancer via pathway activity inference.

Kim T, Jeong H, Sohn K. BMC Med Genomics. 2019; 12(Suppl 5):94.

PMID: 31296204 PMC: 6624183. DOI: 10.1186/s12920-019-0511-x.


Robust pathway-based multi-omics data integration using directed random walks for survival prediction in multiple cancer studies.

Kim S, Jeong H, Kim J, Moon J, Sohn K. Biol Direct. 2019; 14(1):8.

PMID: 31036036 PMC: 6489180. DOI: 10.1186/s13062-019-0239-8.


Identifying subtype-specific associations between gene expression and DNA methylation profiles in breast cancer.

Lee G, Bang L, Kim S, Kim D, Sohn K. BMC Med Genomics. 2017; 10(Suppl 1):28.

PMID: 28589855 PMC: 5461552. DOI: 10.1186/s12920-017-0268-z.
