Correcting Illumina Data

Overview
Journal: Brief Bioinform
Specialty: Biology
Date: 2014 Sep 4
PMID: 25183248
Citations: 18
Abstract

Next-generation sequencing technologies have revolutionized the ways in which genetic information is obtained and have opened the door for many essential applications in biomedical sciences. Hundreds of gigabytes of data are being produced, and all applications are affected by the errors in the data. Many programs have been designed to correct these errors, most of them targeting the data produced by the dominant technology, Illumina. We present a thorough comparison of these programs. Both HiSeq and MiSeq types of Illumina data are analyzed, and correction performance is evaluated as the gain in depth and breadth of coverage, as given by correct reads and k-mers. Time and memory requirements, scalability and parallelism are considered as well. Practical guidelines are provided for the effective use of these tools. We also evaluate the efficiency of the current state-of-the-art programs for correcting Illumina data and provide research directions for further improvement.
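
The abstract does not spell out how gain in depth and breadth of coverage is computed, so the following is only an illustrative sketch of one plausible reading, not the paper's definition. It assumes a known reference sequence, counts a read k-mer as correct when it occurs in the reference, ignores reverse complements, and reports gain as the relative change between raw and corrected reads. The function names and the gain formula are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Return all overlapping substrings of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def coverage_by_correct_kmers(reads, reference, k):
    """Depth and breadth of coverage from k-mers that match the reference.

    Assumption: a k-mer is 'correct' if it occurs anywhere in the reference
    (forward strand only; reverse complements are ignored for brevity).
    """
    ref_kmers = set(kmers(reference, k))
    counts = Counter(
        km for read in reads for km in kmers(read, k) if km in ref_kmers
    )
    breadth = len(counts) / len(ref_kmers)         # fraction of reference k-mers seen
    depth = sum(counts.values()) / len(ref_kmers)  # mean correct occurrences per reference k-mer
    return depth, breadth


def relative_gain(before, after):
    """Relative improvement of a coverage measure (assumed formula)."""
    return (after - before) / before


reference = "ACGTACGGTC"
raw_reads = ["ACGTAC", "TACGCT"]        # second read carries one substitution error
corrected_reads = ["ACGTAC", "TACGGT"]  # hypothetical output of a correction tool

depth_raw, breadth_raw = coverage_by_correct_kmers(raw_reads, reference, 4)
depth_cor, breadth_cor = coverage_by_correct_kmers(corrected_reads, reference, 4)
print(relative_gain(depth_raw, depth_cor), relative_gain(breadth_raw, breadth_cor))
```

In this toy case a single corrected base restores two reference k-mers, which is why k-mer-level measures respond more sharply to correction than read-level ones.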

Citing Articles

MAC-ErrorReads: machine learning-assisted classifier for filtering erroneous NGS reads.

Sami A, El-Metwally S, Rashad M. BMC Bioinformatics. 2024; 25(1):61.

PMID: 38321434 PMC: 10848413. DOI: 10.1186/s12859-024-05681-1.


Illumina reads correction: evaluation and improvements.

Dlugosz M, Deorowicz S. Sci Rep. 2024; 14(1):2232.

PMID: 38278837 PMC: 11222498. DOI: 10.1038/s41598-024-52386-9.


ReSeq simulates realistic Illumina high-throughput sequencing data.

Schmeing S, Robinson M. Genome Biol. 2021; 22(1):67.

PMID: 33608040 PMC: 7896392. DOI: 10.1186/s13059-021-02265-7.


Benchmarking of computational error-correction methods for next-generation sequencing data.

Mitchell K, Brito J, Mandric I, Wu Q, Knyazev S, Chang S. Genome Biol. 2020; 21(1):71.

PMID: 32183840 PMC: 7079412. DOI: 10.1186/s13059-020-01988-3.


Athena: Automated Tuning of k-mer based Genomic Error Correction Algorithms using Language Models.

Abdallah M, Mahgoub A, Ahmed H, Chaterji S. Sci Rep. 2019; 9(1):16157.

PMID: 31695060 PMC: 6834855. DOI: 10.1038/s41598-019-52196-4.