
Comparative Study of Statistical Approaches and SNP Panels to Infer Distant Relationships in Forensic Genetics

Overview
Journal: Genes (Basel)
Publisher: MDPI
Date: 2025 Feb 26
PMID: 40004443
Abstract

Inferring genetic relationships from genetic data has gained increasing attention in recent years, driven in particular by the rise of forensic investigative genetic genealogy (FIGG) and by the introduction of expanded SNP panels in forensic genetics. A plethora of statistical methods are used across publications: direct-to-consumer (DTC) testing uses the shared segment approach; relationship screening in medical genetic research uses, for instance, method-of-moments estimators, e.g., estimation of the kinship coefficient; and forensic genetics commonly uses the likelihood and the likelihood ratio to evaluate the genetic data under competing hypotheses. The current study aims to compare and contrast examples of these statistical methods for inferring relationships from genetic data.

This study includes some historical and some recently published panels of SNP markers to illustrate the strengths and caveats of the statistical methods on different marker sets and a selection of pre-defined pairwise relationships, 1st through 7th degree. Extensive simulations are performed and subsequently subset according to these marker panels.

As has been shown in previous research, the likelihood ratio is the most powerful method, i.e., it yields a high proportion of correct classifications, when SNP data are sparse, say below 20,000 markers, whereas the windowed kinship and segment approaches are equally powerful when very dense SNP data are available, say >20,000 markers. In between lie approaches using method-of-moments estimators, which perform well when the degree of relationship is below four but less well for relationships beyond, say, the 4th degree. The likelihood ratio is the only method that is easily adapted for non-pairwise tests and therefore has an additional depth not addressed in the current study.

We furthermore study genotyping error rates and their impact on the different statistical methods used to infer relationships. The results show that error rates below 1% appear to have little impact across all methods, in particular for errors yielding false heterozygote genotypes.
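
To make the likelihood ratio approach mentioned above concrete, the following is a minimal sketch of a pairwise likelihood ratio for biallelic SNPs. It is not the implementation used in the study. It assumes unlinked markers, Hardy-Weinberg proportions, and no genotyping error or mutation, whereas dense panels generally require models that account for linkage. The function and dictionary names are hypothetical, and genotypes are coded as the number of copies (0, 1 or 2) of a reference allele with population frequency p.

```python
from math import prod

# IBD coefficients (k0, k1, k2): probabilities that a pair shares
# 0, 1 or 2 alleles identical by descent at a locus (standard values).
IBD = {
    "unrelated": (1.0, 0.0, 0.0),
    "parent_child": (0.0, 1.0, 0.0),
    "full_siblings": (0.25, 0.5, 0.25),
    "half_siblings": (0.5, 0.5, 0.0),
    "first_cousins": (0.75, 0.25, 0.0),
}


def genotype_prob(g, p):
    """Hardy-Weinberg probability of carrying g copies of the reference allele."""
    q = 1.0 - p
    return (q * q, 2.0 * p * q, p * p)[g]


def joint_prob_ibd1(g1, g2, p):
    """Joint genotype probability given exactly one allele shared IBD."""
    q = 1.0 - p
    table = {
        (2, 2): p ** 3,
        (2, 1): p * p * q, (1, 2): p * p * q,
        (1, 1): p * q,
        (1, 0): p * q * q, (0, 1): p * q * q,
        (0, 0): q ** 3,
    }
    # Opposite homozygotes cannot share an allele, hence probability 0.
    return table.get((g1, g2), 0.0)


def pair_likelihood(g1, g2, p, k):
    """Likelihood of one marker for a pair with IBD coefficients k."""
    k0, k1, k2 = k
    same = 1.0 if g1 == g2 else 0.0
    return (k0 * genotype_prob(g1, p) * genotype_prob(g2, p)
            + k1 * joint_prob_ibd1(g1, g2, p)
            + k2 * genotype_prob(g1, p) * same)


def likelihood_ratio(pairs, freqs, h1, h2="unrelated"):
    """LR of hypothesis h1 versus h2, multiplying over independent markers.

    For thousands of markers, sum log-likelihoods instead to avoid underflow.
    """
    num = prod(pair_likelihood(a, b, p, IBD[h1]) for (a, b), p in zip(pairs, freqs))
    den = prod(pair_likelihood(a, b, p, IBD[h2]) for (a, b), p in zip(pairs, freqs))
    return num / den


# Toy data (invented for illustration): three markers for one pair.
pairs = [(2, 2), (1, 1), (0, 1)]
freqs = [0.5, 0.3, 0.2]
print(likelihood_ratio(pairs, freqs, "full_siblings"))
```

Under these assumptions, relationships with identical IBD coefficients, such as half-siblings, grandparent-grandchild and avuncular pairs, give identical likelihoods, so separating them requires information from linked markers.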
