
MatchRanges: Generating Null Hypothesis Genomic Ranges Via Covariate-matched Sampling

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2023 Apr 21
PMID: 37084270
Abstract

Motivation: Deriving biological insights from genomic data commonly requires comparing attributes of selected genomic loci to a null set of loci. The selection of this null set is non-trivial, as it requires careful consideration of potential covariates, a problem that is exacerbated by the non-uniform distribution of genomic features including genes, enhancers, and transcription factor binding sites. Propensity score-based covariate matching methods allow the selection of null sets from a pool of possible items while controlling for multiple covariates; however, existing packages do not operate on genomic data classes and can be slow for large data sets, which makes them difficult to integrate into genomic workflows.

Results: To address this, we developed matchRanges, a propensity score-based covariate matching method for the efficient and convenient generation of matched null ranges from a set of background ranges within the Bioconductor framework.
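The general idea of propensity score-based covariate matching can be sketched in a few lines. The example below is an illustration only, written in Python with the standard library; it is not the nullranges API and not the authors' implementation. It assumes a single made-up covariate (a GC-content-like value), a logistic model fitted by plain gradient descent, and greedy nearest-neighbor matching without replacement as one possible matching strategy. All function and variable names, including `match_ranges_sketch`, are hypothetical.

```python
import math
import random


def standardize(rows):
    """Scale each covariate column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def score(w, x):
    """Propensity score: modelled probability that an item is focal."""
    z = w[-1] + sum(wi * xi for wi, xi in zip(w, x))
    z = max(-30.0, min(30.0, z))  # guard against overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, steps=300, lr=0.5):
    """Fit P(label = 1 | covariates) by plain gradient descent."""
    n, d = len(rows), len(rows[0])
    w = [0.0] * (d + 1)  # last entry is the intercept
    for _ in range(steps):
        grad = [0.0] * (d + 1)
        for x, y in zip(rows, labels):
            err = score(w, x) - y
            for j in range(d):
                grad[j] += err * x[j]
            grad[d] += err
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def match_ranges_sketch(focal, pool):
    """Return one distinct pool index per focal item (nearest score)."""
    rows = standardize(focal + pool)
    labels = [1] * len(focal) + [0] * len(pool)
    w = fit_logistic(rows, labels)
    scores = [score(w, x) for x in rows]
    available = {i: scores[len(focal) + i] for i in range(len(pool))}
    matched = []
    for s in scores[:len(focal)]:
        best = min(available, key=lambda i: abs(available[i] - s))
        matched.append(best)
        del available[best]  # sample without replacement
    return matched


# Toy data (assumed, not from the paper): one covariate per item.
random.seed(0)
focal = [[random.gauss(0.6, 0.05)] for _ in range(50)]
pool = [[random.uniform(0.3, 0.7)] for _ in range(500)]
matched = match_ranges_sketch(focal, pool)

mean_focal = sum(x[0] for x in focal) / len(focal)
mean_pool = sum(x[0] for x in pool) / len(pool)
mean_matched = sum(pool[i][0] for i in matched) / len(matched)
print(f"focal {mean_focal:.3f}  pool {mean_pool:.3f}  matched {mean_matched:.3f}")
```

In this toy setting the matched subset should follow the covariate distribution of the focal set more closely than the unfiltered pool does, which is the property a null set needs. A real genomic workflow would carry the matched indices back to the corresponding range objects and would typically match on several covariates at once.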

Availability And Implementation: Package: https://bioconductor.org/packages/nullranges, Code: https://github.com/nullranges, Documentation: https://nullranges.github.io/nullranges.

Citing Articles

Response eQTLs, chromatin accessibility, and 3D chromatin structure in chondrocytes provide mechanistic insight into osteoarthritis risk.

Kramer N, Byun S, Coryell P, DCosta S, Thulson E, Kim H. Cell Genom. 2025; 5(1):100738.

PMID: 39788104 PMC: 11770232. DOI: 10.1016/j.xgen.2024.100738.


Gaussian processes for time series with lead-lag effects with applications to biology data.

Mu W, Chen J, Davis E, Reed K, Phanstiel D, Love M. Biometrics. 2025; 81(1).

PMID: 39775854 PMC: 11704948. DOI: 10.1093/biomtc/ujae156.


Response eQTLs, chromatin accessibility, and 3D chromatin structure in chondrocytes provide mechanistic insight into osteoarthritis risk.

Kramer N, Byun S, Coryell P, DCosta S, Thulson E, Kim H. bioRxiv. 2024.

PMID: 38952796 PMC: 11216363. DOI: 10.1101/2024.05.05.592567.


The tidyomics ecosystem: enhancing omic data analyses.

Hutchison W, Keyes T, Crowell H, Serizay J, Soneson C, Davis E. Nat Methods. 2024; 21(7):1166-1170.

PMID: 38877315 DOI: 10.1038/s41592-024-02299-2.


The tidyomics ecosystem: Enhancing omic data analyses.

Hutchison W, Keyes T, Crowell H, Serizay J, Soneson C, Davis E. bioRxiv. 2024.

PMID: 38826347 PMC: 11142095. DOI: 10.1101/2023.09.10.557072.

