
Multi-omics Protein-coding Units As Massively Parallel Bayesian Networks: Empirical Validation of Causality Structure

Overview
Journal iScience
Publisher Cell Press
Date 2022 Mar 31
PMID 35355520
Abstract

In this article we use high-throughput epigenomics, transcriptomics, and proteomics data to construct fine-grained models of "protein-coding units", which gather all transcript isoforms and chromatin accessibility peaks associated with each of more than 4000 human genes. Each protein-coding unit has the structure of a directed acyclic graph (DAG) and can be represented as a Bayesian network. The factorization of the joint probability distribution induced by a DAG imposes a number of conditional independence relationships among the variables forming a protein-coding unit, corresponding to the missing edges in the DAG. We show that a large fraction of these conditional independencies are indeed verified by the data. Factors driving this verification appear to be the structural and functional annotation of the transcript isoforms, as well as a notion of structural balance (that is, absence of frustration) in the corresponding sample correlation graph, which naturally leads to a reduction of correlation (and hence to independence) upon conditioning.
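To make the idea of a conditional independence implied by a missing DAG edge concrete, here is a minimal sketch on simulated data. It is not the authors' pipeline: the three-node chain (accessibility peak, transcript, protein), the linear Gaussian model, the variable names, and the use of partial correlation with a Fisher z-test are all assumptions chosen for illustration.

```python
import math
import numpy as np


def partial_corr(x, y, z):
    """Partial correlation of x and y given one conditioning variable z."""
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def fisher_z_pvalue(r, n, n_cond=1):
    """Two-sided p-value for H0: (partial) correlation is zero, Gaussian data."""
    z = math.atanh(r) * math.sqrt(n - n_cond - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def is_balanced_triangle(x, y, z):
    """A correlation triangle is balanced when the product of its signs is positive."""
    c = np.corrcoef(np.vstack([x, y, z]))
    return bool(c[0, 1] * c[0, 2] * c[1, 2] > 0)


# Hypothetical toy unit: peak -> transcript -> protein, with NO direct
# peak -> protein edge, so the DAG implies peak is independent of protein
# given transcript.
rng = np.random.default_rng(0)
n = 5000
peak = rng.normal(size=n)
transcript = 0.8 * peak + rng.normal(size=n)
protein = 0.8 * transcript + rng.normal(size=n)

marginal = np.corrcoef(peak, protein)[0, 1]
conditional = partial_corr(peak, protein, transcript)
p_value = fisher_z_pvalue(conditional, n)
balanced = is_balanced_triangle(peak, transcript, protein)

print(f"marginal corr(peak, protein)       = {marginal:.3f}")
print(f"partial corr given transcript      = {conditional:.3f}")
print(f"Fisher z p-value for partial corr  = {p_value:.3f}")
print(f"correlation triangle balanced      = {balanced}")
```

In this toy chain the marginal correlation between peak and protein is clearly nonzero, while the partial correlation given the transcript is close to zero, as the missing edge requires. The numerator of the partial correlation, r_xy - r_xz * r_yz, also hints at why structural balance matters: when the three correlation signs multiply to a positive value, the subtracted term has the same sign as r_xy, so conditioning pulls the correlation toward zero.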

Citing Articles

Targeted deep learning classification and feature extraction for clinical diagnosis.

Tsai Y, Nanthakumar V, Mohammadi S, Baldwin S, Gopaluni B, Geng F. iScience. 2023; 26(11):108006.

PMID: 37876820 PMC: 10590983. DOI: 10.1016/j.isci.2023.108006.


SAMBA: Structure-Learning of Aquaculture Microbiomes Using a Bayesian Approach.

Soriano B, Hafez A, Naya-Catala F, Moroni F, Moldovan R, Toxqui-Rodriguez S. Genes (Basel). 2023; 14(8).

PMID: 37628701 PMC: 10454057. DOI: 10.3390/genes14081650.


Dealing with dimensionality: the application of machine learning to multi-omics data.

Feldner-Busztin D, Firbas Nisantzis P, Edmunds S, Boza G, Racimo F, Gopalakrishnan S. Bioinformatics. 2023; 39(2).

PMID: 36637211 PMC: 9907220. DOI: 10.1093/bioinformatics/btad021.
