Kerfdr: a Semi-parametric Kernel-based Approach to Local False Discovery Rate Estimation

Overview

Journal BMC Bioinformatics

Publisher BioMed Central

Specialty Biology

Date 2009 Mar 18

PMID 19291295

Citations 9

Authors

Mickael Guedj

Stephane Robin

Alain Celisse

Gregory Nuel

Abstract

Background: The use of current high-throughput genetic, genomic and post-genomic data leads to the simultaneous evaluation of a large number of statistical hypotheses and, consequently, to the multiple-testing problem. As an alternative to the overly conservative Family-Wise Error Rate (FWER), the False Discovery Rate (FDR) has emerged over the last ten years as a more appropriate way to handle this problem. One drawback of the FDR, however, is that it is tied to a given rejection region for the statistics considered, and so attributes the same value to statistics close to the boundary of that region and to those far from it. The local FDR has therefore recently been proposed to quantify the specific probability that a given null hypothesis is true.

Results: In this context we present a semi-parametric approach based on kernel estimators, which we apply to different kinds of high-throughput biological data, such as patterns in DNA sequences, gene expression and genome-wide association studies.

Conclusion: Compared with existing approaches, the proposed method has the practical advantages of accommodating complex heterogeneities in the alternative hypothesis, of taking prior information into account (from expert judgment or previous studies) through a semi-supervised mode, and of handling truncated distributions such as those obtained in Monte-Carlo simulations. The method is implemented in the R package kerfdr, available from CRAN or at http://stat.genopole.cnrs.fr/software/kerfdr.
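To make the idea in the abstract concrete, here is a minimal Python sketch of a semi-parametric, kernel-based local FDR estimate. It is not the kerfdr package's code or API. It assumes a two-component mixture in which the null density is known and standard normal (for example, after a probit transform of p-values), the null proportion `pi0` is supplied by the user and held fixed, and the alternative density is re-estimated at each iteration by a weighted Gaussian kernel estimator. The function name `local_fdr_sketch`, its parameters, the rule-of-thumb bandwidth and the `labels` convention for the semi-supervised mode are all hypothetical choices made for illustration.

```python
import numpy as np


def std_normal_pdf(u):
    """Standard normal density, used here both as the assumed null
    density and as the smoothing kernel."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def local_fdr_sketch(x, pi0, labels=None, bandwidth=None,
                     n_iter=500, tol=1e-8):
    """Illustrative local FDR estimate (hypothetical helper, not kerfdr).

    x        : statistics whose null distribution is assumed to be N(0, 1)
    pi0      : assumed proportion of true nulls, 0 < pi0 < 1, held fixed
    labels   : optional array with 0 (known null), 1 (known alternative)
               and NaN (unknown), for a semi-supervised run
    Returns the estimated probability that each null hypothesis is true.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    pi1 = 1.0 - pi0
    if bandwidth is None:
        # Normal-reference rule of thumb; an assumption of this sketch.
        bandwidth = 1.06 * x.std() * n ** (-0.2)

    f0 = std_normal_pdf(x)  # null density at each observation
    # kern[i, j] = K_h(x_i - x_j), the kernel centred at x_j, evaluated at x_i
    kern = std_normal_pdf((x[:, None] - x[None, :]) / bandwidth) / bandwidth

    # tau[i] is the posterior probability that observation i is an alternative
    tau = np.full(n, pi1)
    if labels is not None:
        labels = np.asarray(labels, dtype=float)
        known = ~np.isnan(labels)
        tau[known] = labels[known]

    for _ in range(n_iter):
        # Weighted kernel estimate of the alternative density at each x_i
        f1 = kern @ tau / tau.sum()
        new_tau = pi1 * f1 / (pi0 * f0 + pi1 * f1)
        if labels is not None:
            new_tau[known] = labels[known]  # known labels stay fixed
        converged = np.max(np.abs(new_tau - tau)) < tol
        tau = new_tau
        if converged:
            break

    return 1.0 - tau  # local FDR = posterior probability of the null
```

A call such as `local_fdr_sketch(x, pi0=0.8)` returns one value per statistic, so hypotheses near the boundary of a rejection region can be told apart from those far inside it. The sketch builds an n-by-n kernel matrix, which is only practical for a few thousand statistics.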

Citing Articles

Mitochondrial Transcriptome Control and Intercompartment Cross-Talk During Plant Development.

Niazi A, Delannoy E, Iqbal R, Mileshina D, Val R, Gabryelska M. Cells. 2019; 8(6).

PMID: 31200566 PMC: 6627697. DOI: 10.3390/cells8060583.

Non-parametric estimation of survival in age-dependent genetic disease and application to the transthyretin-related hereditary amyloidosis.

Alarcon F, Plante-Bordeneuve V, Olsson M, Nuel G. PLoS One. 2018; 13(9):e0203860.

PMID: 30252892 PMC: 6155453. DOI: 10.1371/journal.pone.0203860.

A two-stage hidden Markov model design for biomarker detection, with application to microbiome research.

Zhou Y, Brooks P, Wang X. Stat Biosci. 2018; 10(1):41-58.

PMID: 30174757 PMC: 6116560. DOI: 10.1007/s12561-017-9187-y.

Local false discovery rate estimation using feature reliability in LC/MS metabolomics data.

Chong E, Huang Y, Wu H, Ghasemzadeh N, Uppal K, Quyyumi A. Sci Rep. 2015; 5:17221.

PMID: 26596774 PMC: 4657040. DOI: 10.1038/srep17221.

Genotype by watering regime interaction in cultivated tomato: lessons from linkage mapping and gene expression.

Albert E, Gricourt J, Bertin N, Bonnefoi J, Pateyron S, Tamby J. Theor Appl Genet. 2015; 129(2):395-418.

PMID: 26582510 DOI: 10.1007/s00122-015-2635-5.
