LTRharvest, an Efficient and Flexible Software for De Novo Detection of LTR Retrotransposons

Overview
Publisher Biomed Central
Specialty Biology
Date 2008 Jan 16
PMID 18194517
Citations 720
Abstract

Background: Transposable elements are abundant in eukaryotic genomes and are believed to have a significant impact on the evolution of gene and chromosome structure. While several eukaryotic genome projects have been completed, there are only a few high-quality genome-wide annotations of transposable elements. There is therefore considerable demand for computational identification of transposable elements. LTR retrotransposons, an important subclass of transposable elements, are well suited for computational identification because they contain long terminal repeats (LTRs).

Results: We have developed LTRharvest, a software tool for the de novo detection of full-length LTR retrotransposons in large sequence sets. LTRharvest efficiently delivers high-quality annotations based on known LTR transposon features such as length, distance, and sequence motifs. A quality validation of LTRharvest against gold standard annotations for Saccharomyces cerevisiae and Drosophila melanogaster shows a sensitivity of up to 90% and 97% and a specificity of 100% and 72%, respectively. This is comparable to or slightly better than the annotations produced by previous software tools. The main advantages of LTRharvest over previous tools are (a) its ability to efficiently handle large datasets from finished or unfinished genome projects, (b) its flexibility in incorporating known sequence features into the prediction, and (c) its availability as open source software.
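The feature-based filtering mentioned above (length, distance, and sequence motifs) can be illustrated with a minimal Python sketch. This is not LTRharvest's implementation: the `Candidate` type, the `passes_filters` function, the threshold values, and the TG..CA motif choice are all hypothetical and serve only to show how such constraints might be applied to a pair of candidate LTRs once similar regions have been found.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A pair of similar regions that might be the two LTRs of one element.

    Coordinates are 0-based and end-exclusive (hypothetical convention).
    """
    left_start: int
    left_end: int
    right_start: int
    right_end: int


def passes_filters(seq, cand,
                   min_ltr_len=100, max_ltr_len=1000,    # placeholder thresholds
                   min_distance=1000, max_distance=15000,
                   motif=("TG", "CA")):                  # assumed start/end motif
    """Return True if the candidate satisfies length, distance, and motif checks."""
    left = seq[cand.left_start:cand.left_end].upper()
    right = seq[cand.right_start:cand.right_end].upper()

    # Length: each putative LTR must fall within the allowed size range.
    for ltr in (left, right):
        if not (min_ltr_len <= len(ltr) <= max_ltr_len):
            return False

    # Distance: measured here between the start positions of the two LTRs.
    distance = cand.right_start - cand.left_start
    if not (min_distance <= distance <= max_distance):
        return False

    # Motif: optionally require each LTR to start and end with given dinucleotides.
    if motif is not None:
        start_motif, end_motif = motif
        for ltr in (left, right):
            if not (ltr.startswith(start_motif) and ltr.endswith(end_motif)):
                return False

    return True


# Toy example: two identical 100 bp "LTRs" separated by 1500 bp of filler.
ltr = "TG" + "A" * 96 + "CA"
genome = ltr + "C" * 1500 + ltr
candidate = Candidate(0, 100, 1600, 1700)
print(passes_filters(genome, candidate))
```

Passing `motif=None` switches the motif check off, which mirrors the idea that known sequence features can be incorporated into the prediction when available and left out otherwise.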

Conclusion: LTRharvest is an efficient software tool delivering high-quality annotation of LTR retrotransposons. It can, for example, process the largest human chromosome in approximately 8 minutes on a Linux PC with 4 GB of memory. Its flexibility and its small space and run-time requirements make LTRharvest a very competitive candidate for future LTR retrotransposon annotation projects. Moreover, its structured design and implementation and its availability as open source provide an excellent base for incorporating novel concepts to further improve the prediction of LTR retrotransposons.

Citing Articles

A near-complete genome assembly of Fragaria iinumae.

Du H, He Y, Chen M, Zheng X, Gui D, Tang J. BMC Genomics. 2025; 26(1):253.

PMID: 40087556 DOI: 10.1186/s12864-025-11440-0.


Hordeum I genome unlocks adaptive evolution and genetic potential for crop improvement.

Feng H, Du Q, Jiang Y, Jia Y, He T, Wang Y. Nat Plants. 2025.

PMID: 40087544 DOI: 10.1038/s41477-025-01942-w.


Haplotype-resolved and chromosome-level reference genome assembly of provides insights into the evolution and juvenile growth of persimmon.

Guan C, Liu Y, Li Z, Zhang Y, Liu Z, Zhu Q. Hortic Res. 2025; 12(4):uhaf001.

PMID: 40078717 PMC: 11896977. DOI: 10.1093/hr/uhaf001.


The pan-genome of Spodoptera frugiperda provides new insights into genome evolution and horizontal gene transfer.

Huang Y, Rao H, Su B, Lv J, Lin J, Wang X. Commun Biol. 2025; 8(1):407.

PMID: 40069391 PMC: 11897360. DOI: 10.1038/s42003-025-07707-7.


Marine vs. terrestrial: links between the environment and the diversity of Copia retrotransposon in metazoans.

Klai K, Farhat S, Lamothe L, Higuet D, Bonnivard E. Mob DNA. 2025; 16(1):9.

PMID: 40055832 PMC: 11889832. DOI: 10.1186/s13100-025-00346-z.

