
Minor Allele Frequency Thresholds Strongly Affect Population Structure Inference with Genomic Data Sets

Overview
Journal Mol Ecol Resour
Date 2019 Jan 20
PMID 30659755
Citations 115
Abstract

A common method of minimizing errors in large DNA sequence data sets is to drop variable sites with a minor allele frequency (MAF) below some specified threshold. Although widespread, this procedure has the potential to alter downstream population genetic inferences and has received relatively little rigorous analysis. Here we use simulations and an empirical single nucleotide polymorphism data set to demonstrate the impacts of MAF thresholds on inference of population structure, which is often the first step in the analysis of population genomic data. We find that model-based inference of population structure is confounded when singletons are included in the alignment, and that both model-based and multivariate analyses infer less distinct clusters when more stringent MAF cutoffs are applied. We propose that this behaviour is caused by a combination of a drop in the total size of the data matrix and correlations between allele frequencies and mutational age. We recommend a set of best practices for applying MAF filters in studies seeking to describe population structure with genomic data.
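
To make the filtering step concrete, here is a minimal sketch of a MAF filter and a singleton filter in plain Python. It is an illustration only, not the authors' pipeline: the genotype coding (0, 1 or 2 copies of the alternate allele per diploid individual, with None for a missing call), the function names, and the example thresholds are all assumptions made for this example.

```python
def minor_allele_stats(genotypes):
    """Return a (minor allele count, minor allele frequency) pair per site.

    Assumed input: a list of sites, each a list of diploid genotype calls
    coded as 0, 1 or 2 copies of the alternate allele, or None if missing.
    """
    stats = []
    for site in genotypes:
        called = [g for g in site if g is not None]
        n_alleles = 2 * len(called)           # two alleles per called diploid
        alt = sum(called)                     # copies of the alternate allele
        mac = min(alt, n_alleles - alt)       # the rarer allele's count
        maf = mac / n_alleles if n_alleles else 0.0
        stats.append((mac, maf))
    return stats


def filter_sites(genotypes, min_maf=0.0, min_mac=0):
    """Return indices of sites passing both (hypothetical) thresholds."""
    return [
        i
        for i, (mac, maf) in enumerate(minor_allele_stats(genotypes))
        if mac >= min_mac and maf >= min_maf
    ]


# Toy data: five sites scored in four individuals.
sites = [
    [0, 0, 0, 1],      # singleton: one copy of the minor allele
    [0, 1, 1, 2],      # common variant
    [2, 2, 2, 2],      # monomorphic
    [0, 0, 1, 1],      # minor allele seen twice
    [2, 2, None, 1],   # singleton among three called individuals
]

no_singletons = filter_sites(sites, min_mac=2)     # drop singletons only
maf_filtered = filter_sites(sites, min_maf=0.15)   # frequency threshold
```

The two filters are not interchangeable. With missing data, a minor allele count cutoff removes every singleton, whereas a frequency cutoff can retain one at a site where few individuals were called, as the last toy site shows.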

Citing Articles

Triangular Causality Among Pulmonary Hypertension, Sleep Disorders, and Brain Structure at the Genetic Level: A Mendelian Randomization Study Focused on the Lung-Brain Axis.

Zhang C, Su X, Zhang Y, He P, Kong X, Zhang Z. Nat Sci Sleep. 2025; 17:343-356.

PMID: 40008303 PMC: 11853869. DOI: 10.2147/NSS.S495071.


Weak genetic divergence and signals of adaptation obscured by high gene flow in an economically important aquaculture species.

Calla B, Song J, Thompson N. BMC Genomics. 2025; 26(1):112.

PMID: 39910466 PMC: 11796273. DOI: 10.1186/s12864-025-11259-9.


Reassessing Hybridisation in Australian Stingless Bees Using Multiple Genetic Markers.

Hereward J, Smith T, Gloag R, Brookes D, Walter G. Ecol Evol. 2025; 15(2):e70912.

PMID: 39896774 PMC: 11775563. DOI: 10.1002/ece3.70912.


The Once and Future Fish: Assessing a Millennium of Atlantic Herring Exploitation Through Mixed-Stock Analysis and Ancient DNA.

Atmore L, van der Jagt I, Boilard A, Haberle S, Blevis R, Dierickx K. Glob Chang Biol. 2024; 30(12):e70010.

PMID: 39723543 PMC: 11670043. DOI: 10.1111/gcb.70010.


Concordant Signal of Genetic Variation Across Marker Densities in the Desert Annual Is Linked With Timing of Winter Precipitation.

Shryock D, Le N, DeFalco L, Esque T. Evol Appl. 2024; 17(12):e70046.

PMID: 39691745 PMC: 11649585. DOI: 10.1111/eva.70046.