
Kraken: a Set of Tools for Quality Control and Analysis of High-throughput Sequence Data

Overview
Journal: Methods
Specialty: Biochemistry
Date: 2013 Jul 3
PMID: 23816787
Citations: 235
Abstract

New sequencing technologies pose significant challenges in terms of data complexity and magnitude. It is essential that efficient software is developed with performance that scales with this growth in sequence information. Here we present a comprehensive and integrated set of tools for the analysis of data from large-scale sequencing experiments. The tool set supports adapter detection and removal, demultiplexing of barcodes, paired-end data, a range of read architectures and the efficient removal of sequence redundancy. Sequences can be trimmed and filtered based on length, quality and complexity. Quality control plots track sequence length, composition and summary statistics with respect to genomic annotation. Several use cases have been integrated into a single streamlined pipeline, including both mRNA and small RNA sequencing experiments. This pipeline interfaces with existing tools for genomic mapping and differential expression analysis.
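The read-cleaning steps named in the abstract (adapter removal, trimming and filtering on length, quality and complexity, and removal of sequence redundancy) can be illustrated with a minimal Python sketch. This is not the Kraken implementation: every function name, threshold and the example adapter below is hypothetical, adapter matching is exact with no mismatch tolerance, and qualities are assumed to be Phred+33 encoded.

```python
from collections import Counter


def trim_adapter(seq, qual, adapter, min_overlap=5):
    """Remove a 3' adapter: a full exact match anywhere in the read,
    or a prefix of the adapter (at least min_overlap bases) at the read end."""
    idx = seq.find(adapter)
    if idx != -1:
        return seq[:idx], qual[:idx]
    # Try progressively shorter adapter prefixes against the 3' end.
    for k in range(min(len(adapter) - 1, len(seq)), min_overlap - 1, -1):
        if seq.endswith(adapter[:k]):
            return seq[:len(seq) - k], qual[:len(qual) - k]
    return seq, qual


def trim_quality(seq, qual, min_q=20, offset=33):
    """Strip bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def is_low_complexity(seq, max_fraction=0.8):
    """Flag reads in which a single base exceeds max_fraction of the sequence."""
    if not seq:
        return True
    return Counter(seq).most_common(1)[0][1] / len(seq) > max_fraction


def process_reads(reads, adapter, min_len=15):
    """Clean (seq, qual) pairs and collapse identical sequences into counts."""
    counts = Counter()
    for seq, qual in reads:
        seq, qual = trim_adapter(seq, qual, adapter)
        seq, qual = trim_quality(seq, qual)
        if len(seq) < min_len or is_low_complexity(seq):
            continue  # drop reads that are too short or too repetitive
        counts[seq] += 1  # redundancy removal: one entry per unique read
    return counts


if __name__ == "__main__":
    adapter = "TGGAATTCTCGG"  # hypothetical adapter, for illustration only
    insert = "ACGTACGTACGTACGTAGCT"
    reads = [
        (insert + adapter, "I" * 32),      # full adapter present
        (insert + adapter[:7], "I" * 27),  # partial adapter at the 3' end
        ("A" * 30, "I" * 30),              # low-complexity read
    ]
    for seq, count in process_reads(reads, adapter).items():
        print(seq, count)
```

Collapsing identical reads into counts keeps abundance information while shrinking the input handed to downstream mapping, which is one plausible reading of "removal of sequence redundancy"; a production tool would also need mismatch-tolerant adapter alignment, barcode demultiplexing and paired-end handling, none of which is sketched here.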

Citing Articles

Regulation of alternative splicing by CBF-mediated protein condensation in plant response to cold stress.

Fu D, Song Y, Wu S, Peng Y, Ming Y, Li Z. Nat Plants. 2025; .

PMID: 40044940 DOI: 10.1038/s41477-025-01933-x.


Effects of green manuring on chemical characteristics and microecology of tobacco-growing soil in central Henan.

Liu W, Chen X, Zhao Y, Shi H. BMC Microbiol. 2025; 25(1):42.

PMID: 39856595 PMC: 11761188. DOI: 10.1186/s12866-025-03742-w.


The TLR7/9 adaptors TASL and TASL2 mediate IRF5-dependent antiviral responses and autoimmunity in mouse.

Drobek A, Bernaleau L, Delacretaz M, Calderon Copete S, Royer-Chardon C, Longepierre M. Nat Commun. 2025; 16(1):967.

PMID: 39856058 PMC: 11759703. DOI: 10.1038/s41467-024-55692-y.


Plasma miRNAs Correlate with Structural Brain and Cardiac Damage in Friedreich's Ataxia.

Peluzzo T, Vieira A, Matos A, Silveira C, Martin M, Filho O. Cerebellum. 2024; 24(1):15.

PMID: 39688804 DOI: 10.1007/s12311-024-01766-y.


Evolutionary Nonindependence Between Human piRNAs and Their Potential Target Sites in Protein-Coding Genes.

He C, Zhu H. J Mol Evol. 2024; 93(1):83-99.

PMID: 39621077 DOI: 10.1007/s00239-024-10220-w.

