Information Theoretic Approaches for Inference of Biological Networks from Continuous-valued Data
Background: Characterising programs of gene regulation by studying individual protein-DNA and protein-protein interactions would require a large volume of high-resolution proteomics data, and such data are not yet available. Instead, many gene regulatory network (GRN) inference techniques have been developed, which leverage the wealth of transcriptomic data generated by recent consortia to study indirect, gene-level relationships between transcriptional regulators. Despite their popularity, existing GRN inference methods exhibit limitations that we highlight and address through the lens of information theory.
Results: We introduce new model-free and non-linear information theoretic measures for the inference of GRNs and other biological networks from continuous-valued data. Although previous tools have implemented mutual information as a means of inferring pairwise associations, they either introduce statistical bias through discretisation or are limited to modelling undirected relationships. Our approach overcomes both of these limitations, as demonstrated by a substantial improvement in empirical performance for a set of 160 GRNs of varying size and topology.
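The discretisation bias mentioned above can be illustrated with a minimal sketch. This is not the estimator implemented in NAIL; it is a plain plug-in (histogram) mutual information estimate, and the function name binned_mutual_information, the equal-width binning scheme, the sample size and the bin counts are all assumptions chosen for illustration. For two independent Gaussian variables the true mutual information is zero, yet the binned estimate is positive and depends on the number of bins.

```python
import math
import random
from collections import Counter


def binned_mutual_information(xs, ys, n_bins):
    """Plug-in MI estimate (in nats) after equal-width discretisation.

    Hypothetical helper for illustration only; not part of NAIL.
    """
    def discretise(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / n_bins or 1.0  # guard against constant input
        # Clamp the maximum value into the last bin.
        return [min(int((v - lo) / width), n_bins - 1) for v in values]

    bx, by = discretise(xs), discretise(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # Sum of p(i,j) * log(p(i,j) / (p(i) * p(j))) over occupied cells.
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [rng.gauss(0.0, 1.0) for _ in range(500)]  # independent of x: true MI is 0

for bins in (4, 8, 16):
    print(bins, round(binned_mutual_information(x, y, bins), 3))
```

Because the estimate for independent data grows as the bins get finer, any threshold applied to binned mutual information depends on an arbitrary binning choice, which is one motivation for estimators that operate directly on continuous values.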
Conclusions: The information theoretic measures described in this study yield substantial improvements over previous approaches (e.g. ARACNE) and have been implemented in the latest release of NAIL (Network Analysis and Inference Library). However, despite the theoretical and empirical advantages of these new measures, they do not circumvent the fundamental limitation of indeterminacy exhibited across this class of biological networks. These methods have so far found value in computational neurobiology, and will likely gain traction for GRN analysis as the volume and quality of temporal transcriptomics data continue to improve.