
Streamlining Remote Nanopore Data Access with Slow5curl

Overview
Journal: Gigascience
Specialties: Biology, Genetics
Date: 2024 Apr 12
PMID: 38608279
Abstract

Background: As adoption of nanopore sequencing technology continues to advance, the need to maintain large volumes of raw current signal data for reanalysis with updated algorithms is a growing challenge. Here we introduce slow5curl, a software package designed to streamline nanopore data sharing, accessibility, and reanalysis.

Results: Slow5curl allows a user to fetch a specified read or group of reads from a raw nanopore dataset stored on a remote server, such as a public data repository, without downloading the entire file. Slow5curl uses an index to quickly locate specific reads in a large dataset in SLOW5/BLOW5 format, and issues highly parallelized data access requests to maximize download speeds. Using all public nanopore data from the Human Pangenome Reference Consortium (>22 TB), we demonstrate how slow5curl can be used to quickly fetch and reanalyze raw signal reads corresponding to a set of target genes from each individual in a large cohort dataset (n = 91), minimizing the time, egress costs, and local storage requirements for their reanalysis.
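The general idea of index-driven, parallel partial fetching can be illustrated with a minimal Python sketch. This is not slow5curl's implementation and does not reproduce the real SLOW5/BLOW5 index layout: the index structure, the names `fetch_reads` and `range_header`, and the in-memory stand-in for a remote file are all assumptions made for illustration. In a real setting, `fetch_range` would issue an HTTP request carrying the `Range` header built by `range_header`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Hypothetical index: read ID -> (byte offset, record length in bytes).
# The real SLOW5/BLOW5 index format is not reproduced here.
Index = Dict[str, Tuple[int, int]]


def range_header(offset: int, length: int) -> Dict[str, str]:
    """Build an HTTP Range header for one record (the end byte is inclusive)."""
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def fetch_reads(
    index: Index,
    read_ids: List[str],
    fetch_range: Callable[[int, int], bytes],
    max_workers: int = 8,
) -> Dict[str, bytes]:
    """Fetch only the requested records, issuing the requests concurrently."""

    def fetch_one(read_id: str) -> bytes:
        offset, length = index[read_id]  # look up where the record lives
        return fetch_range(offset, length)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each ID with its bytes.
        return dict(zip(read_ids, pool.map(fetch_one, read_ids)))


# In-memory stand-in for a remote file: a 6-byte header plus three records.
REMOTE = b"HEADER" + b"AAAA" + b"CCCCCC" + b"GG"
INDEX: Index = {"read_1": (6, 4), "read_2": (10, 6), "read_3": (16, 2)}


def fake_fetch_range(offset: int, length: int) -> bytes:
    """Simulate a server answering a byte-range request."""
    return REMOTE[offset:offset + length]


if __name__ == "__main__":
    print(range_header(*INDEX["read_1"]))
    print(fetch_reads(INDEX, ["read_3", "read_1"], fake_fetch_range))
```

Because each record is addressed by offset and length, only the requested bytes are transferred, and independent requests can run side by side rather than one after another.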

Conclusions: We provide slow5curl as a free, open-source package that will reduce friction in data sharing for the nanopore community: https://github.com/BonsonW/slow5curl.

Citing Articles

Streamlining remote nanopore data access with slow5curl.

Wong B, Ferguson J, Do J, Gamaarachchi H, Deveson I. Gigascience. 2024; 13.

PMID: 38608279 PMC: 11010652. DOI: 10.1093/gigascience/giae016.
