An E-M Algorithm and Testing Strategy for Multiple-locus Haplotypes

Overview
Journal: Am J Hum Genet
Publisher: Cell Press
Specialty: Genetics
Date: 1995 Mar 1
PMID: 7887436
Citations: 160
Abstract

This paper gives an expectation-maximization (EM) algorithm to obtain allele frequencies, haplotype frequencies, and gametic disequilibrium coefficients for multiple-locus systems. It permits high polymorphism and null alleles at all loci. This approach effectively deals with the primary estimation problems associated with such systems: there is not a one-to-one correspondence between phenotypic and genotypic categories, and sample sizes tend to be much smaller than the number of phenotypic categories. The EM method provides maximum-likelihood estimates and therefore allows hypothesis tests using likelihood-ratio statistics that have χ² distributions with large sample sizes. We also suggest a data resampling approach to estimate the sampling distributions of test statistics. The resampling approach is more computer intensive, but it is applicable to all sample sizes. A strategy to test hypotheses about aggregate groups of gametic disequilibrium coefficients is recommended. This strategy minimizes the number of necessary hypothesis tests while at the same time describing the structure of disequilibrium. These methods are applied to three unlinked dinucleotide repeat loci in Navajo Indians and to three linked HLA loci in Gila River (Pima) Indians. The likelihood functions of both data sets are shown to be maximized by the EM estimates, and the testing strategy provides a useful description of the structure of gametic disequilibrium. Following these applications, a number of simulation experiments are performed to test how well the likelihood-ratio statistic distributions are approximated by χ² distributions. In most circumstances the χ² approximation grossly underestimated the probability of type I errors; at times, however, it overestimated that probability. Accordingly, we recommend hypothesis tests that use the resampling method.
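The general idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes only two biallelic, codominant loci with no null alleles, whereas the paper treats multiple highly polymorphic loci and allows null alleles. Each genotype is given as a pair of allele counts `(ga, gb)`. The function names (`em_haplotypes`, `lr_statistic`, `permutation_p_value`) are hypothetical, and the permutation scheme (shuffling locus-B genotypes across individuals) is one common way to resample under the no-disequilibrium hypothesis, assumed here for illustration rather than taken from the paper.

```python
import math
import random
from collections import Counter

# Haplotypes at two biallelic loci are coded 0..3 as 2*a + b,
# where a and b are the alleles (0 or 1) carried at loci A and B.

def compatible_pairs(ga, gb):
    """Unordered haplotype pairs consistent with allele counts (ga, gb)."""
    return [(h1, h2) for h1 in range(4) for h2 in range(h1, 4)
            if (h1 >> 1) + (h2 >> 1) == ga and (h1 & 1) + (h2 & 1) == gb]

def pair_prob(freqs, h1, h2):
    """Probability of an unordered haplotype pair under random mating."""
    p = freqs[h1] * freqs[h2]
    return p if h1 == h2 else 2.0 * p

def em_haplotypes(genotypes, max_iter=500, tol=1e-10):
    """EM estimate of the four haplotype frequencies from unphased genotypes."""
    counts = Counter(genotypes)
    n = sum(counts.values())
    freqs = [0.25] * 4  # start from equal frequencies
    for _ in range(max_iter):
        expected = [0.0] * 4
        for (ga, gb), c in counts.items():
            pairs = compatible_pairs(ga, gb)
            weights = [pair_prob(freqs, h1, h2) for h1, h2 in pairs]
            total = sum(weights)
            # E-step: split each genotype count over its possible phases.
            for (h1, h2), w in zip(pairs, weights):
                expected[h1] += c * w / total
                expected[h2] += c * w / total
        # M-step: expected haplotype counts divided by 2n chromosomes.
        new = [e / (2 * n) for e in expected]
        converged = max(abs(x - y) for x, y in zip(new, freqs)) < tol
        freqs = new
        if converged:
            break
    return freqs

def log_likelihood(genotypes, freqs):
    """Log-likelihood of the genotype sample given haplotype frequencies."""
    return sum(c * math.log(sum(pair_prob(freqs, h1, h2)
                                for h1, h2 in compatible_pairs(ga, gb)))
               for (ga, gb), c in Counter(genotypes).items())

def lr_statistic(genotypes):
    """Likelihood-ratio statistic for 'no gametic disequilibrium'."""
    n = len(genotypes)
    pa = sum(g[0] for g in genotypes) / (2 * n)  # allele-1 frequency, locus A
    pb = sum(g[1] for g in genotypes) / (2 * n)  # allele-1 frequency, locus B
    null = [(1 - pa) * (1 - pb), (1 - pa) * pb, pa * (1 - pb), pa * pb]
    alt = em_haplotypes(genotypes)
    return 2.0 * (log_likelihood(genotypes, alt) - log_likelihood(genotypes, null))

def permutation_p_value(genotypes, n_perm=200, seed=0):
    """Resampling p-value: shuffle locus-B genotypes to break association."""
    rng = random.Random(seed)
    observed = lr_statistic(genotypes)
    a = [g[0] for g in genotypes]
    b = [g[1] for g in genotypes]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(b)
        if lr_statistic(list(zip(a, b))) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Made-up toy sample: only the double heterozygotes (1, 1) are phase-ambiguous.
sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
freqs = em_haplotypes(sample)
d_coefficient = freqs[3] - (freqs[2] + freqs[3]) * (freqs[1] + freqs[3])
g_stat = lr_statistic(sample)
p_value = permutation_p_value(sample)
```

Here `d_coefficient` is the usual two-locus disequilibrium coefficient (haplotype frequency minus the product of allele frequencies). The resampled `p_value` does not rely on the χ² approximation, which is the reason the abstract gives for preferring resampling when samples are small relative to the number of phenotypic categories.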

Citing Articles

Epistatic interaction between ERAP2 and HLA modulates HIV-1 adaptation and disease outcome in an Australian population.

Al-Kaabi M, Deshpande P, Firth M, Pavlos R, Chopra A, Basiri H PLoS Pathog. 2024; 20(7):e1012359.

PMID: 38980912 PMC: 11259285. DOI: 10.1371/journal.ppat.1012359.


Epistasis Between HLA-DRB1*16:02:01 and SLC16A11 T-C-G-T-T Reduces Odds for Type 2 Diabetes in Southwest American Indians.

Williams R, Hanson R, Peters B, Kearns K, Knowler W, Bogardus C Diabetes. 2024; 73(6):1002-1011.

PMID: 38530923 PMC: 11109785. DOI: 10.2337/db23-0925.


Accurate construction of long range haplotype in unrelated individuals.

Johnson N, London S, Romieu I, Wong W, Tang H Stat Sin. 2023; 23:1441-1461.

PMID: 37398638 PMC: 10312227. DOI: 10.5705/ss.2012.141s.


Evaluation of the influence of genetic variants in Cereblon gene on the response to the treatment of erythema nodosum leprosum with thalidomide.

Costa P, Maciel-Fiuza M, Kowalski T, Fraga L, Feira M, Aranha Camargo L Mem Inst Oswaldo Cruz. 2022; 117:e220039.

PMID: 36383784 PMC: 9668341. DOI: 10.1590/0074-02760220039.


Protective association of HLA-DPB1*04:01:01 with acute encephalopathy with biphasic seizures and late reduced diffusion identified by HLA imputation.

Kasai M, Omae Y, Khor S, Shibata A, Hoshino A, Mizuguchi M Genes Immun. 2022; 23(3-4):123-128.

PMID: 35422513 DOI: 10.1038/s41435-022-00170-y.

