Who's Who? Detecting and Resolving Sample Anomalies in Human DNA Sequencing Studies with Peddy

Overview
Journal Am J Hum Genet
Publisher Cell Press
Specialty Genetics
Date 2017 Feb 14
PMID 28190455
Citations 132
Abstract

The potential for genetic discovery in human DNA sequencing studies is greatly diminished if DNA samples from a cohort are mislabeled, swapped, or contaminated or if they include unintended individuals. Unfortunately, the potential for such errors is significant since DNA samples are often manipulated by several protocols, labs, or scientists in the process of sequencing. We have developed a software package, peddy, to identify and facilitate the remediation of such errors via interactive visualizations and reports comparing the stated sex, relatedness, and ancestry to what is inferred from the individual genotypes derived from whole-genome (WGS) or whole-exome (WES) sequencing. Peddy predicts a sample's ancestry using a machine learning model trained on individuals of diverse ancestries from the 1000 Genomes Project reference panel. Peddy facilitates both automated and interactive, visual detection of sample swaps, poor sequencing quality, and other indicators of sample problems that, if left undetected, would inhibit discovery.
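The comparison of stated and inferred sex described in the abstract can be illustrated with a minimal sketch. This is not peddy's implementation: the `Sample` type, the function names, the genotype encoding, and the 0.1 heterozygosity threshold are all hypothetical choices made for illustration. It relies only on the general observation that samples with one X chromosome should show almost no heterozygous calls at X-linked sites outside the pseudoautosomal regions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    # Hypothetical record type, not part of peddy.
    sample_id: str
    stated_sex: str          # "male" or "female", as recorded in the pedigree
    x_genotypes: List[int]   # non-PAR chrX calls: 0 = hom-ref, 1 = het, 2 = hom-alt


def infer_sex(x_genotypes: List[int], het_threshold: float = 0.1) -> str:
    """Guess sex from the fraction of heterozygous chrX calls.

    The threshold is an illustrative assumption, not a published value.
    """
    called = [g for g in x_genotypes if g in (0, 1, 2)]
    if not called:
        return "unknown"
    het_fraction = sum(1 for g in called if g == 1) / len(called)
    return "female" if het_fraction > het_threshold else "male"


def find_sex_mismatches(
    samples: List[Sample], het_threshold: float = 0.1
) -> List[Tuple[str, str, str]]:
    """Return (sample_id, stated_sex, inferred_sex) for each disagreement."""
    mismatches = []
    for s in samples:
        inferred = infer_sex(s.x_genotypes, het_threshold)
        if inferred != "unknown" and inferred != s.stated_sex:
            mismatches.append((s.sample_id, s.stated_sex, inferred))
    return mismatches


if __name__ == "__main__":
    cohort = [
        Sample("a", "male", [0, 2, 0, 2, 0, 0, 2, 0, 0, 0]),
        Sample("b", "female", [0, 1, 1, 0, 2, 1, 0, 1, 0, 0]),
        Sample("c", "male", [0, 1, 1, 0, 2, 1, 0, 1, 0, 0]),
    ]
    for sample_id, stated, inferred in find_sex_mismatches(cohort):
        print(f"{sample_id}: stated {stated}, inferred {inferred}")
```

A flagged sample is only a candidate for follow-up: a mismatch may reflect a sample swap, a pedigree entry error, contamination, or a sex-chromosome aneuploidy, and this sketch does not distinguish between them.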

Citing Articles

Identification of plasma proteomic markers underlying polygenic risk of type 2 diabetes and related comorbidities.

Loesch D, Garg M, Matelska D, Vitsios D, Jiang X, Ritchie S Nat Commun. 2025; 16(1):2124.

PMID: 40032831 PMC: 11876343. DOI: 10.1038/s41467-025-56695-z.


Analysis of Short Tandem Repeat Expansions in a Cohort of 12,496 Exomes from Patients with Neurological Diseases Reveals Variable Genotyping Rate Dependent on Exome Capture Kits.

Rocca C, Murphy D, Clarkson C, Zanovello M, Gagliardi D, Genomics Q Genes (Basel). 2025; 16(2).

PMID: 40004498 PMC: 11855749. DOI: 10.3390/genes16020169.


Fetal genetic factors in pregnancy loss: Insights from a meta-analysis and effectiveness of whole exome sequencing.

Hadjipanteli A, Theodosiou A, Papaevripidou I, Alexandrou A, Salameh N, Evangelidou P PLoS One. 2025; 20(2):e0319052.

PMID: 39999070 PMC: 11856309. DOI: 10.1371/journal.pone.0319052.


Assessing the contribution of rare protein-coding germline variants to prostate cancer risk and severity in 37,184 cases.

Mitchell J, Camacho N, Shea P, Stopsack K, Joseph V, Burren O Nat Commun. 2025; 16(1):1779.

PMID: 39971927 PMC: 11839991. DOI: 10.1038/s41467-025-56944-1.


Comparative analysis of the Mexico City Prospective Study and the UK Biobank identifies ancestry-specific effects on clonal hematopoiesis.

Wen S, Kuri-Morales P, Hu F, Nag A, Tachmazidou I, Deevi S Nat Genet. 2025; 57(3):572-582.

PMID: 39948438 PMC: 11906367. DOI: 10.1038/s41588-025-02085-6.

