
A Pipeline NanoTRF As a New Tool for Satellite DNA Identification in the Raw Nanopore Sequencing Reads of Plant Genomes

Overview
Journal Plants (Basel)
Date 2022 Aug 26
PMID 36015406
Abstract

High-copy tandemly organized repeats (TRs), or satellite DNA, are an important but still enigmatic component of eukaryotic genomes. TRs comprise arrays of multi-copy and highly similar tandem repeats, which makes the elucidation of TRs a very challenging task. Oxford Nanopore sequencing data provide a valuable source of information on TR organization at the single-molecule level. However, bioinformatics tools for de novo identification of TRs in raw Nanopore data have not been reported so far. We developed NanoTRF, a new Python pipeline for TR identification, characterization and consensus monomer sequence assembly. This new pipeline requires only a raw Nanopore read file from low-depth (<1×) genome sequencing. The program generates an informative HTML report and figures on TR genome abundance, monomer sequence and monomer length. In addition, NanoTRF annotates transposable element (TE) sequences within or near satDNA arrays, and this information can be used to elucidate how TRs and TEs co-evolve in the genome. Moreover, we validated by FISH that the NanoTRF report is useful for evaluating whether TR chromosome organization is clustered or dispersed. Our findings showed that NanoTRF is a robust method for the de novo identification of satellite repeats in raw Nanopore data without prior read assembly. The obtained sequences can be used in many downstream analyses, including genome assembly assistance and gap estimation, chromosome mapping and cytogenetic marker development.
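To make the idea of finding a tandem repeat in a single unassembled read more concrete, here is a minimal Python sketch. It is not NanoTRF's algorithm, which the abstract does not describe. It is a swapped-in, simplified heuristic that guesses the monomer length from the spacing between repeated k-mers. The function name `estimate_monomer_length` and the parameters `k` and `min_support` are hypothetical, chosen only for this illustration.

```python
from collections import Counter


def estimate_monomer_length(read, k=6, min_support=0.3):
    """Guess a tandem-repeat period from distances between repeated k-mers.

    Illustrative heuristic only (not the NanoTRF method): in a tandem array,
    each k-mer tends to recur one monomer length downstream, so the most
    common distance between consecutive occurrences approximates the period.
    """
    last_seen = {}          # k-mer -> position of its most recent occurrence
    distances = Counter()   # distance between consecutive occurrences -> count
    n_kmers = len(read) - k + 1
    for i in range(n_kmers):
        kmer = read[i:i + k]
        if kmer in last_seen:
            distances[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    if not distances:
        return None
    period, count = distances.most_common(1)[0]
    # Require that a reasonable fraction of k-mers support the period;
    # the threshold is an arbitrary assumption for this sketch.
    if count / n_kmers < min_support:
        return None
    return period


if __name__ == "__main__":
    toy_read = "ACGTTGCAAGCT" * 10   # error-free toy array of a 12 bp monomer
    print(estimate_monomer_length(toy_read))
```

Real Nanopore reads contain insertions, deletions and substitutions, so exact k-mer matching like this would need a tolerance for errors, and the monomer consensus would have to be built from many noisy copies. The sketch only shows why periodicity in a single long read carries enough signal to detect a satellite array without prior assembly.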

Citing Articles

SatXplor-a comprehensive pipeline for satellite DNA analyses in complex genome assemblies.

Volaric M, Mestrovic N, Despot-Slade E Brief Bioinform. 2024; 26(1).

PMID: 39708839 PMC: 11663013. DOI: 10.1093/bib/bbae660.


Bioinformatics in Russia: history and present-day landscape.

Nawaz M, Pamirsky I, Golokhvast K Brief Bioinform. 2024; 25(6).

PMID: 39402695 PMC: 11473191. DOI: 10.1093/bib/bbae513.


Genome Studies in Four Species of L. (Asteraceae) Using Satellite DNAs as Chromosome Markers.

Samatadze T, Yurkevich O, Khazieva F, Basalaeva I, Savchenko O, Zoshchuk S Plants (Basel). 2023; 12(23).

PMID: 38068691 PMC: 10708038. DOI: 10.3390/plants12234056.


Satellite DNAs-From Localized to Highly Dispersed Genome Components.

Satovic-Vuksic E, Plohl M Genes (Basel). 2023; 14(3).

PMID: 36981013 PMC: 10048060. DOI: 10.3390/genes14030742.


Telomeres and Their Neighbors.

Jenner L, Peska V, Fulneckova J, Sykorova E Genes (Basel). 2022; 13(9).

PMID: 36140830 PMC: 9498494. DOI: 10.3390/genes13091663.
