
AluMine: Alignment-free Method for the Discovery of Polymorphic Alu Element Insertions

Overview
Journal Mob DNA
Publisher Biomed Central
Specialty Genetics
Date 2019 Jul 31
PMID 31360240
Citations 6
Abstract

Background: Recently, alignment-free sequence analysis methods have gained popularity in the field of personal genomics. These methods are based on counting frequencies of short k-mer sequences, thus allowing faster and more robust analysis compared to traditional alignment-based methods.
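
To make the idea of counting short subsequence frequencies concrete, the following is a minimal sketch in Python. It is not taken from AluMine; the function name `count_kmers` and the toy reads are assumptions chosen for illustration, and production tools use far more compact data structures than a dictionary.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across all reads (hypothetical helper).

    Substrings containing the ambiguous base 'N' are skipped.
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        # Slide a window of width k across the read.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


# Toy input; real reads are much longer and the abstract uses k = 32.
reads = ["ACGTACGT", "CGTAC"]
counts = count_kmers(reads, k=4)
print(counts["ACGT"])  # 2
print(counts["CGTA"])  # 2
```

No alignment step appears anywhere in this sketch: the reads are only scanned once, which is the property the abstract credits for the speed of such methods.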

Results: We have created a fast alignment-free method, AluMine, to analyze polymorphic insertions of Alu elements in the human genome. We tested the method on 2,241 individuals from the Estonian Genome Project and identified 28,962 potential polymorphic Alu element insertions. Each tested individual had on average 1,574 Alu element insertions that were different from those in the reference genome. In addition, we propose an alignment-free genotyping method that uses the frequency of insertion/deletion-specific 32-mer pairs to call the genotype directly from raw sequencing reads. Using this method, the concordance between the predicted and experimentally observed genotypes was 98.7%. The running time of the discovery pipeline is approximately 2 h per individual. The genotyping of potential polymorphic insertions takes between 0.4 and 4 h per individual, depending on the hardware configuration.
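
As a rough illustration of calling a genotype from the frequencies of an insertion-specific and a deletion-specific 32-mer, the sketch below classifies a locus from two read counts. It is a simplified stand-in and not the genotyping model used by AluMine: the function name `call_genotype`, the fixed allele-fraction thresholds, and the minimum-coverage cutoff are all assumptions made for this example.

```python
def call_genotype(ins_count, ref_count, min_total=5, low=0.2, high=0.8):
    """Call a genotype from counts of a 32-mer pair (hypothetical sketch).

    ins_count: occurrences of the insertion-specific k-mer in the reads
    ref_count: occurrences of the deletion-specific (no-insertion) k-mer
    min_total, low, high: illustrative cutoffs, not values from the paper
    """
    total = ins_count + ref_count
    if total < min_total:
        return "NC"  # too few observations to make a call
    frac = ins_count / total  # fraction of k-mers supporting the insertion
    if frac < low:
        return "0/0"  # insertion absent on both chromosomes
    if frac > high:
        return "1/1"  # insertion present on both chromosomes
    return "0/1"  # heterozygous


print(call_genotype(0, 30))   # 0/0
print(call_genotype(14, 16))  # 0/1
print(call_genotype(28, 1))   # 1/1
print(call_genotype(1, 2))    # NC
```

Fixed thresholds keep the example short; a statistical model over the two counts would handle uneven coverage and sequencing errors more gracefully.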

Conclusions: AluMine provides tools that allow discovery of novel Alu element insertions and/or genotyping of known Alu element insertions from personal genomes within a few hours.

Citing Articles

GeneToCN: an alignment-free method for gene copy number estimation directly from next-generation sequencing reads.

Pajuste F, Remm M. Sci Rep. 2023; 13(1):17765.

PMID: 37853040 PMC: 10584998. DOI: 10.1038/s41598-023-44636-z.


Genotyping of Transposable Element Insertions Segregating in Human Populations Using Short-Read Realignments.

Chen X, Bourque G, Goubert C. Methods Mol Biol. 2022; 2607:63-83.

PMID: 36449158 DOI: 10.1007/978-1-0716-2883-6_4.


Human Retrotransposons and Effective Computational Detection Methods for Next-Generation Sequencing Data.

Lee H, Min J, Mun S, Han K. Life (Basel). 2022; 12(10).

PMID: 36295018 PMC: 9605557. DOI: 10.3390/life12101583.


An insertion map of the Indian population: identification and analysis in 1021 genomes of the IndiGen project.

Prakrithi P, Singhal K, Sharma D, Jain A, Bhoyar R, Imran M. NAR Genom Bioinform. 2022; 4(1):lqac009.

PMID: 35178516 PMC: 8846365. DOI: 10.1093/nargab/lqac009.


The Simons Genome Diversity Project: A Global Analysis of Mobile Element Diversity.

Watkins W, Feusier J, Thomas J, Goubert C, Mallick S, Jorde L. Genome Biol Evol. 2020; 12(6):779-794.

PMID: 32359137 PMC: 7290288. DOI: 10.1093/gbe/evaa086.

