
ReGenotyper: Detecting Mislabeled Samples in Genetic Data

Overview
Journal: PLoS One
Date: 2017 Feb 14
PMID: 28192439
Citations: 10
Abstract

In high-throughput molecular profiling studies, genotype labels can be wrongly assigned at various experimental steps; the resulting mislabeled samples seriously reduce the power to detect the genetic basis of phenotypic variation. We have developed an approach to detect potential mislabeling, recover the "ideal" genotype and identify "best-matched" labels for mislabeled samples. On average, we identified 4% of samples as mislabeled in eight published datasets, highlighting the necessity of applying a "data cleaning" step before standard data analysis.

Citing Articles

Reassessing Hybridisation in Australian Stingless Bees Using Multiple Genetic Markers.

Hereward J, Smith T, Gloag R, Brookes D, Walter G. Ecol Evol. 2025; 15(2):e70912.

PMID: 39896774. PMC: 11775563. DOI: 10.1002/ece3.70912.


The genetics of gene expression in a Caenorhabditis elegans multiparental recombinant inbred line population.

Snoek B, Sterken M, Nijveen H, Volkers R, Riksen J, Rosenstiel P. G3 (Bethesda). 2021; 11(10).

PMID: 34568931. PMC: 8496280. DOI: 10.1093/g3journal/jkab258.


Impact of genotypic errors with equal and unequal family contribution on accuracy of genomic prediction in aquaculture using simulation.

Khalilisamani N, Thomson P, Raadsma H, Khatkar M. Sci Rep. 2021; 11(1):18318.

PMID: 34526591. PMC: 8443606. DOI: 10.1038/s41598-021-97873-5.


iDEP Web Application for RNA-Seq Data Analysis.

Ge X. Methods Mol Biol. 2021; 2284:417-443.

PMID: 33835455. DOI: 10.1007/978-1-0716-1307-8_22.


Comparative analysis of transcriptomic profile, histology, and IDH mutation for classification of gliomas.

Tran P, Tran L, Nechtman J, Santos B, Purohit S, Bin Satter K. Sci Rep. 2020; 10(1):20651.

PMID: 33244057. PMC: 7692499. DOI: 10.1038/s41598-020-77777-6.

