
An Integrative Genomic Approach to Uncover Molecular Mechanisms of Prokaryotic Traits

Overview
Specialty Biology
Date 2006 Nov 23
PMID 17112314
Citations 19
Abstract

With the mounting availability of genomic and phenotypic databases, data integration and mining become increasingly challenging. Although efforts have been made to analyze prokaryotic phenotypes, current computational technologies either lack the high-throughput capacity needed for genome-scale analysis or are limited in their ability to integrate and mine data across different scales of biology. Consequently, simultaneous analysis of associations among genomes, phenotypes, and gene functions has not been feasible. Here, we developed a high-throughput computational approach and demonstrated, for the first time, the feasibility of integrating large quantities of prokaryotic phenotypes with genomic datasets for mining across multiple scales of biology (protein domains, pathways, molecular functions, and cellular processes). Applying this method to 59 fully sequenced prokaryotic species, we identified the genetic basis and molecular mechanisms underlying the phenotypes in bacteria. We identified 3,711 significant correlations between 1,499 distinct Pfam entries and 63 phenotypes, comprising 2,650 positive correlations and 1,061 anti-correlations. Manual evaluation of a random sample of these significant correlations showed a minimal precision of 30% (95% confidence interval: 20%-42%; n = 50). We stratified the most significant 478 predictions and subjected 100 of them to manual evaluation, of which 60 were corroborated in the literature. We furthermore unveiled 10 significant correlations between phenotypes and KEGG pathways, eight of which were corroborated in the evaluation, and 309 significant correlations between phenotypes and 166 GO concepts, which were evaluated using a random sample (minimal precision = 72%; 95% confidence interval: 60%-80%; n = 50). Additionally, we conducted a novel large-scale phenomic visualization analysis to provide insight into the modular nature of common molecular mechanisms that span multiple biological scales and are reused by related phenotypes (metaphenotypes). We propose that this method elucidates which classes of molecular mechanisms are associated with phenotypes or metaphenotypes and holds promise in facilitating a computable systems biology approach to genomic and biomedical research.
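The abstract does not say which statistic was used to score a correlation or anti-correlation between a Pfam entry and a phenotype. As a minimal sketch only, and not the authors' method, the Python below assumes that each species contributes one binary observation (entry present or absent, phenotype present or absent). It scores the pair with a phi coefficient, whose sign separates correlation from anti-correlation, and with a two-sided Fisher exact test. All function names and the toy data are hypothetical, and multiple-testing correction is deliberately left out.

```python
from math import comb, sqrt


def contingency(domain_present, phenotype_present):
    """Count the 2x2 table (a, b, c, d) over paired per-species observations.

    a: domain and phenotype, b: domain only, c: phenotype only, d: neither.
    """
    a = b = c = d = 0
    for dom, phe in zip(domain_present, phenotype_present):
        if dom and phe:
            a += 1
        elif dom:
            b += 1
        elif phe:
            c += 1
        else:
            d += 1
    return a, b, c, d


def phi_coefficient(a, b, c, d):
    """Phi coefficient of a 2x2 table; positive means co-occurrence."""
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return 0.0 if denom == 0 else (a * d - b * c) / denom


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value from the hypergeometric distribution."""
    n = a + b + c + d
    row1 = a + b  # species carrying the domain
    col1 = a + c  # species showing the phenotype
    total = comb(n, col1)

    def prob(x):
        # Probability of x species having both, with margins held fixed.
        return comb(row1, x) * comb(n - row1, col1 - x) / total

    p_obs = prob(a)
    lo = max(0, col1 - (n - row1))
    hi = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Toy data for ten hypothetical species (not taken from the study).
domain = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
phenotype = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

table = contingency(domain, phenotype)
phi = phi_coefficient(*table)
p_value = fisher_exact_two_sided(*table)
print(table, round(phi, 3), round(p_value, 3))
```

In a genome-scale setting this test would be repeated for every entry and phenotype pair, so a correction for multiple comparisons would be needed before calling any result significant.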

Citing Articles

From Genomes to Phenotypes: Traitar, the Microbial Trait Analyzer.

Weimann A, Mooren K, Frank J, Pope P, Bremges A, McHardy A. mSystems. 2017; 1(6).

PMID: 28066816 PMC: 5192078. DOI: 10.1128/mSystems.00101-16.


The landscape of microbial phenotypic traits and associated genes.

Brbic M, Piskorec M, Vidulin V, Krisko A, Smuc T, Supek F. Nucleic Acids Res. 2016; 44(21):10074-10090.

PMID: 27915291 PMC: 5137458. DOI: 10.1093/nar/gkw964.


iTAR: a web server for identifying target genes of transcription factors using ChIP-seq or ChIP-chip data.

Yang C, Andrews E, Chen M, Wang W, Chen J, Gerstein M. BMC Genomics. 2016; 17(1):632.

PMID: 27519564 PMC: 4983039. DOI: 10.1186/s12864-016-2963-0.


COPD Hospitalization Risk Increased with Distinct Patterns of Multiple Systems Comorbidities Unveiled by Network Modeling.

Lee Y, Boyd A, Li J, Gardeux V, Kenost C, Saner D. AMIA Annu Symp Proc. 2015; 2014:855-64.

PMID: 25954392 PMC: 4419951.


Explaining microbial phenotypes on a genomic scale: GWAS for microbes.

Dutilh B, Backus L, Edwards R, Wels M, Bayjanov J, van Hijum S. Brief Funct Genomics. 2013; 12(4):366-80.

PMID: 23625995 PMC: 3743258. DOI: 10.1093/bfgp/elt008.

