Low-, High-coverage, and Two-stage DNA Sequencing in the Design of the Genetic Association Study
Overview
The next-generation sequencing-based genetic association study (GAS) is a powerful tool for identifying candidate disease variants and genomic regions. Low-coverage sequencing is inexpensive but inadequate for calling rare variants, whereas high-coverage sequencing detects essentially every variant but at a high cost. Two-stage sequencing may be an economical way to conduct GAS without losing power. In two-stage sequencing, an affordable number of samples are sequenced at high coverage to serve as a reference panel, which is then used to impute genotypes in a larger sample sequenced at low coverage. As unit sequencing costs continue to decrease, investigators can now conduct GAS with more flexible sequencing depths. Here, we systematically evaluate the effect of read depth and sample size on variant discovery power and association power for study designs using low-coverage, high-coverage, and two-stage sequencing. We consider 12 low-coverage, 12 high-coverage, and 51 two-stage design scenarios, with read depth varying from 0.5× to 80×. Using state-of-the-art simulation and analysis packages together with in-house scripts, we simulate the complete study process, from DNA sequencing to SNP (single nucleotide polymorphism) calling and association testing. Our results show that, with appropriate allocation of sequencing effort, two-stage sequencing is an effective approach for conducting GAS. We provide practical guidelines to help investigators plan an optimal sequencing-based GAS, including two-stage designs, given their specific constraints on sequencing investment.
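The budget trade-off among the three designs can be illustrated with a short sketch. This is not the authors' simulation pipeline. The cost model (a fixed per-sample preparation cost plus a cost proportional to read depth), the function names, and every numeric value are assumptions chosen only for illustration.

```python
def samples_affordable(budget, depth, cost_per_x, prep_cost):
    """Number of samples that fit in `budget` when each is sequenced at `depth`.

    Assumed (hypothetical) cost model: one sample costs
    prep_cost + depth * cost_per_x.
    """
    per_sample = prep_cost + depth * cost_per_x
    return int(budget // per_sample)


def two_stage_design(budget, n_ref, ref_depth, low_depth, cost_per_x, prep_cost):
    """Split a budget between a high-coverage panel and a low-coverage sample.

    Returns (n_ref, n_low): the reference panel size and the number of
    additional samples affordable at `low_depth` with the remaining funds.
    """
    ref_cost = n_ref * (prep_cost + ref_depth * cost_per_x)
    if ref_cost > budget:
        raise ValueError("reference panel alone exceeds the budget")
    n_low = samples_affordable(budget - ref_cost, low_depth, cost_per_x, prep_cost)
    return n_ref, n_low


# Illustrative values only (not taken from the study).
budget, cost_per_x, prep_cost = 100_000, 10, 50

print(samples_affordable(budget, 30, cost_per_x, prep_cost))        # 285
print(samples_affordable(budget, 2, cost_per_x, prep_cost))         # 1428
print(two_stage_design(budget, 100, 30, 2, cost_per_x, prep_cost))  # (100, 928)
```

Under these assumed costs, the same budget covers 285 samples at 30×, 1428 samples at 2×, or a 100-sample reference panel at 30× plus 928 samples at 2×. The sketch covers only sample-size arithmetic. Variant discovery power and association power for each allocation would still have to be estimated by simulation, as in the study.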