
Addressing the Mean-correlation Relationship in Co-expression Analysis

Overview
Specialty Biology
Date 2022 Mar 30
PMID 35353807
Abstract

Estimates of correlation between pairs of genes in co-expression analysis are commonly used to construct networks among genes using gene expression data. As previously noted, the distribution of such correlations depends on the observed expression level of the genes involved. We refer to this dependence as a mean-correlation relationship; it is present in both bulk and single-cell RNA-seq data. This dependence introduces an unwanted technical bias in co-expression analysis whereby highly expressed genes are more likely to be highly correlated. Such a relationship is not observed in protein-protein interaction data, suggesting that it does not reflect biology. Ignoring this bias can lead to missing potentially biologically relevant pairs of lowly expressed genes, such as transcription factors. To address this problem, we introduce spatial quantile normalization (SpQN), a method for normalizing local distributions in a correlation matrix. We show that spatial quantile normalization removes the mean-correlation relationship and corrects the expression bias in network reconstruction.
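The idea of normalizing local distributions in a correlation matrix can be illustrated with a small sketch. The code below is a simplified illustration only and is not the published SpQN procedure: it assumes disjoint expression bins, and it assumes the block of the most highly expressed genes as the reference distribution. The function name `binned_quantile_normalize` and the parameter `n_bins` are hypothetical and introduced here for illustration.

```python
import numpy as np

def binned_quantile_normalize(corr, mean_expr, n_bins=4):
    """Map each (bin_a, bin_b) block of a correlation matrix onto the
    empirical distribution of a reference block.

    Simplified sketch: disjoint bins, top-expression block as reference.
    Both choices are assumptions of this example.
    """
    corr = np.asarray(corr, dtype=float)
    mean_expr = np.asarray(mean_expr, dtype=float)
    n = corr.shape[0]

    # Assign each gene to a bin by its rank in mean expression.
    order = np.argsort(mean_expr)
    label = np.empty(n, dtype=int)
    for k, idx in enumerate(np.array_split(order, n_bins)):
        label[idx] = k

    # Work on gene pairs i < j only, so the result stays symmetric.
    iu, ju = np.triu_indices(n, k=1)
    vals = corr[iu, ju]
    lo = np.minimum(label[iu], label[ju])
    hi = np.maximum(label[iu], label[ju])

    # Reference distribution: pairs where both genes are in the top bin.
    ref = np.sort(vals[(lo == n_bins - 1) & (hi == n_bins - 1)])
    if ref.size == 0:
        raise ValueError("reference bin needs at least two genes")

    new_vals = vals.copy()
    for a in range(n_bins):
        for b in range(a, n_bins):
            mask = (lo == a) & (hi == b)
            v = vals[mask]
            if v.size == 0:
                continue
            # Rank within the block, convert ranks to quantiles in [0, 1],
            # then read those quantiles off the reference distribution.
            ranks = v.argsort().argsort()
            q = ranks / max(v.size - 1, 1)
            new_vals[mask] = np.quantile(ref, q)

    out = corr.copy()          # the diagonal is left unchanged
    out[iu, ju] = new_vals
    out[ju, iu] = new_vals
    return out
```

Given a gene-by-gene correlation matrix `corr` and a vector `mean_expr` of per-gene average expression, `binned_quantile_normalize(corr, mean_expr)` returns a matrix in which every expression-bin block shares approximately the same distribution of correlations, so that lowly expressed gene pairs are no longer systematically ranked below highly expressed ones.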

Citing Articles

Cell-type-specific mapping of enhancers and target genes from single-cell multimodal data.

Su C, Lee D, Jin P, Zhang J bioRxiv. 2024.

PMID: 39386519 PMC: 11463474. DOI: 10.1101/2024.09.24.614814.


Massively integrated coexpression analysis reveals transcriptional regulation, evolution and cellular implications of the yeast noncanonical translatome.

Rich A, Acar O, Carvunis A Genome Biol. 2024; 25(1):183.

PMID: 38978079 PMC: 11232214. DOI: 10.1186/s13059-024-03287-7.


eQTLs identify regulatory networks and drivers of variation in the individual response to sepsis.

Burnham K, Milind N, Lee W, Kwok A, Cano-Gamez K, Mi Y Cell Genom. 2024; 4(7):100587.

PMID: 38897207 PMC: 11293594. DOI: 10.1016/j.xgen.2024.100587.


Network-based drug repurposing for schizophrenia.

Truong T, Liu Z, Panizzutti B, Kim J, Dean O, Berk M Neuropsychopharmacology. 2024; 49(6):983-992.

PMID: 38321095 PMC: 11039639. DOI: 10.1038/s41386-024-01805-6.


Cell-type-specific co-expression inference from single cell RNA-sequencing data.

Su C, Xu Z, Shan X, Cai B, Zhao H, Zhang J Nat Commun. 2023; 14(1):4846.

PMID: 37563115 PMC: 10415381. DOI: 10.1038/s41467-023-40503-7.

