
A Massively Parallel Strategy for STR Marker Development, Capture, and Genotyping

Overview
Specialty Biochemistry
Date 2017 Jul 2
PMID 28666376
Citations 5
Abstract

Short tandem repeat (STR) variants are highly polymorphic markers that facilitate powerful population genetic analyses. STRs are especially valuable in conservation and ecological genetic research, yielding detailed information on population structure and short-term demographic fluctuations. Massively parallel sequencing has not previously been leveraged for scalable, efficient STR recovery. Here, we present a pipeline for developing STR markers directly from high-throughput shotgun sequencing data without a reference genome, and an approach for highly parallel target STR recovery. We employed our approach to capture a panel of 5000 STRs from a test group of diademed sifakas (Propithecus diadema, n = 3), endangered Malagasy rainforest lemurs, and we report extremely efficient recovery of targeted loci: 97.3% to 99.6% of STRs were characterized with ≥10x non-redundant sequence coverage. We then tested our STR capture strategy on P. diadema fecal DNA, and report robust initial results and suggestions for future implementations. In addition to STR targets, this approach also generates large, genome-wide single nucleotide polymorphism (SNP) panels from flanking regions. Our method provides a cost-effective and scalable solution for rapid recovery of large STR and SNP datasets in any species without needing a reference genome, and can be used even with suboptimal DNA more easily acquired in conservation and ecological studies.
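To make two ideas from the abstract concrete, the sketch below shows one simple way to spot perfect tandem repeats in individual shotgun reads without a reference genome, and how a figure such as "fraction of targeted STRs with ≥10x coverage" could be computed. It is an illustration only, not the authors' pipeline: the function names, the motif-length range (2 to 6 bases), and the minimum copy number (5) are assumptions chosen for the example, and the abstract does not specify any of them.

```python
import re

# Hypothetical thresholds for illustration; the abstract does not state them.
MIN_UNIT, MAX_UNIT, MIN_COPIES = 2, 6, 5

# A motif of MIN_UNIT..MAX_UNIT bases (shortest tried first) followed by at
# least MIN_COPIES - 1 further exact copies of itself.
STR_PATTERN = re.compile(
    r"([ACGT]{%d,%d}?)\1{%d,}" % (MIN_UNIT, MAX_UNIT, MIN_COPIES - 1)
)


def find_strs(read):
    """Return (start, motif, copies) for each perfect tandem repeat in a read."""
    hits = []
    for match in STR_PATTERN.finditer(read.upper()):
        motif = match.group(1)
        # Skip single-base runs such as AAAA..., which are not 2-6 bp STRs.
        if len(set(motif)) == 1:
            continue
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


def fraction_covered(depth_by_locus, min_depth=10):
    """Fraction of loci whose non-redundant depth is at least min_depth."""
    if not depth_by_locus:
        return 0.0
    covered = sum(1 for depth in depth_by_locus.values() if depth >= min_depth)
    return covered / len(depth_by_locus)


if __name__ == "__main__":
    # Toy read with six copies of "AC" between two short flanks.
    print(find_strs("TTG" + "AC" * 6 + "GGT"))
    # Toy depths for three hypothetical loci.
    print(fraction_covered({"locus_a": 12, "locus_b": 9, "locus_c": 10}))
```

A regular expression only finds uninterrupted repeats and ignores sequencing errors, so a real marker-development workflow would likely need an error-tolerant repeat finder plus checks that each locus has enough unique flanking sequence for probe design.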

Citing Articles

USAT: a bioinformatic toolkit to facilitate interpretation and comparative visualization of tandem repeat sequences.

Wang X, Budowle B, Ge J BMC Bioinformatics. 2022; 23(1):497.

PMID: 36402991 PMC: 9675219. DOI: 10.1186/s12859-022-05021-1.


BigFiRSt: A Software Program Using Big Data Technique for Mining Simple Sequence Repeats From Large-Scale Sequencing Data.

Chen J, Li F, Wang M, Li J, Marquez-Lago T, Leier A Front Big Data. 2022; 4:727216.

PMID: 35118375 PMC: 8805145. DOI: 10.3389/fdata.2021.727216.


Population-level inferences from environmental DNA-Current status and future perspectives.

Sigsgaard E, Jensen M, Winkelmann I, Rask Moller P, Hansen M, Thomsen P Evol Appl. 2020; 13(2):245-262.

PMID: 31993074 PMC: 6976968. DOI: 10.1111/eva.12882.


Genetic and genomic monitoring with minimally invasive sampling methods.

Carroll E, Bruford M, DeWoody J, Leroy G, Strand A, Waits L Evol Appl. 2018; 11(7):1094-1119.

PMID: 30026800 PMC: 6050181. DOI: 10.1111/eva.12600.


SONiCS: PCR stutter noise correction in genome-scale microsatellites.

Kedzierska K, Gerber L, Cagnazzi D, Krutzen M, Ratan A, Kistler L Bioinformatics. 2018; 34(23):4115-4117.

PMID: 29931218 PMC: 6454461. DOI: 10.1093/bioinformatics/bty485.
