
ANNOVAR: Functional Annotation of Genetic Variants from High-throughput Sequencing Data

Overview
Specialty Biochemistry
Date 2010 Jul 6
PMID 20601685
Citations 7432
Abstract

High-throughput sequencing platforms are generating massive amounts of genetic variation data for diverse genomes, but it remains a challenge to pinpoint a small subset of functionally important variants. To fill these unmet needs, we developed the ANNOVAR tool to annotate single nucleotide variants (SNVs) and insertions/deletions, such as examining their functional consequence on genes, inferring cytogenetic bands, reporting functional importance scores, finding variants in conserved regions, or identifying variants reported in the 1000 Genomes Project and dbSNP. ANNOVAR can utilize annotation databases from the UCSC Genome Browser or any annotation data set conforming to Generic Feature Format version 3 (GFF3). We also illustrate a 'variants reduction' protocol on 4.7 million SNVs and indels from a human genome, including two causal mutations for Miller syndrome, a rare recessive disease. Through a stepwise procedure, we excluded variants that are unlikely to be causal, and identified 20 candidate genes including the causal gene. Using a desktop computer, ANNOVAR requires ∼4 min to perform gene-based annotation and ∼15 min to perform variants reduction on 4.7 million variants, making it practical to handle hundreds of human genomes in a day. ANNOVAR is freely available at http://www.openbioinformatics.org/annovar/.
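The stepwise 'variants reduction' idea described in the abstract can be illustrated with a minimal Python sketch. This is not ANNOVAR's implementation or interface. The `Variant` record, the effect labels, the order of the filters, and the "two or more surviving variants per gene" rule used here to stand in for a recessive model are all assumptions made for illustration; only the kinds of evidence used (functional consequence, conservation, presence in the 1000 Genomes Project and dbSNP) come from the abstract.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical effect labels for variants that alter the protein product.
PROTEIN_CHANGING = {"nonsynonymous", "splicing", "frameshift", "stopgain"}


@dataclass(frozen=True)
class Variant:
    """Toy annotated variant; the field names are illustrative assumptions."""
    gene: str
    effect: str        # functional consequence on the gene
    conserved: bool    # lies in a conserved region
    in_1000g: bool     # reported in the 1000 Genomes Project
    in_dbsnp: bool     # reported in dbSNP


# Each step keeps only variants that could still be causal.
# The order of the steps is an assumption, not taken from the abstract.
STEPS = [
    ("protein-changing", lambda v: v.effect in PROTEIN_CHANGING),
    ("in conserved region", lambda v: v.conserved),
    ("absent from 1000 Genomes", lambda v: not v.in_1000g),
    ("absent from dbSNP", lambda v: not v.in_dbsnp),
]


def reduce_variants(variants):
    """Apply the filters in sequence and record how many variants survive each."""
    remaining = list(variants)
    counts = [("input", len(remaining))]
    for label, keep in STEPS:
        remaining = [v for v in remaining if keep(v)]
        counts.append((label, len(remaining)))
    return remaining, counts


def recessive_candidate_genes(variants, min_hits=2):
    """Simplified recessive model (assumption): genes with >= min_hits variants."""
    hits = Counter(v.gene for v in variants)
    return sorted(gene for gene, n in hits.items() if n >= min_hits)


# Made-up toy data; gene names are placeholders, not real results.
toy = [
    Variant("GENE_A", "nonsynonymous", True, False, False),
    Variant("GENE_A", "splicing", True, False, False),
    Variant("GENE_B", "nonsynonymous", True, False, False),
    Variant("GENE_C", "synonymous", True, False, False),
    Variant("GENE_D", "nonsynonymous", False, False, False),
    Variant("GENE_E", "nonsynonymous", True, True, False),
    Variant("GENE_E", "nonsynonymous", True, False, True),
]

survivors, counts = reduce_variants(toy)
for label, n in counts:
    print(f"{label}: {n}")
print("candidate genes:", recessive_candidate_genes(survivors))
```

Recording the count after every step makes it easy to see which filter removes the most variants, which is the practical point of a stepwise procedure: each exclusion criterion is cheap on its own, and together they shrink millions of variants to a short list of candidate genes.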

Citing Articles

Direct measurement of the male germline mutation rate in individuals using sequential sperm samples.

Shoag J, Srinivasa A, Loh C, Liu M, Lassen E, Melanaphy S Nat Commun. 2025; 16(1):2546.

PMID: 40089484 DOI: 10.1038/s41467-025-57507-0.


Benchmarking long-read structural variant calling tools and combinations for detecting somatic variants in cancer genomes.

Aydin S, Yilmaz K, Acar A Sci Rep. 2025; 15(1):8707.

PMID: 40082509 PMC: 11906795. DOI: 10.1038/s41598-025-92750-x.


Analysis of Population Structure and Selective Signatures for Milk Production Traits in Xinjiang Brown Cattle and Chinese Simmental Cattle.

Ma K, Li X, Ma S, Zhang M, Wang D, Xu L Int J Mol Sci. 2025; 26(5).

PMID: 40076627 PMC: 11900343. DOI: 10.3390/ijms26052003.


Genomic Insights into the Population Genetics and Adaptive Evolution of Yellow Seabream () with Whole-Genome Resequencing.

Li Y, Yang J, Fang Y, Zhang R, Cai Z, Shan B Animals (Basel). 2025; 15(5).

PMID: 40076030 PMC: 11898413. DOI: 10.3390/ani15050745.


Primary exploration of cell-free DNA in the plasma of patients with parathyroid neoplasms using next-generation sequencing.

Zheng Q, Cui M, Wang O, Chang X, Xiao J, Chen T Cancer Cell Int. 2025; 25(1):86.

PMID: 40075389 PMC: 11905564. DOI: 10.1186/s12935-025-03699-w.

