
HTSeq--a Python Framework to Work with High-throughput Sequencing Data

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2014 Sep 28
PMID: 25260700
Citations: 11252
Abstract

Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed.

Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.
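The counting task described for htseq-count can be illustrated with a minimal plain-Python sketch. It does not use HTSeq and is not HTSeq's implementation or API. The `Interval` type, the `overlaps` and `count_reads_per_gene` helpers, the `no_feature` and `ambiguous` labels, and the toy coordinates are all assumptions made for illustration. The sketch assumes half-open, unstranded intervals and assigns a read to a gene only when it overlaps exactly one gene.

```python
from collections import Counter
from typing import NamedTuple


class Interval(NamedTuple):
    """Half-open genomic interval [start, end) on one chromosome (hypothetical helper)."""
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def count_reads_per_gene(genes: dict, reads: list) -> Counter:
    """Count reads per gene; reads hitting zero or several genes get their own bins."""
    counts = Counter({name: 0 for name in genes})
    for read in reads:
        # Linear scan over all genes: fine for a toy example, too slow for real data,
        # where an index keyed by genomic coordinates would be used instead.
        hits = [name for name, iv in genes.items() if overlaps(read, iv)]
        if not hits:
            counts["no_feature"] += 1   # illustrative label
        elif len(hits) > 1:
            counts["ambiguous"] += 1    # illustrative label
        else:
            counts[hits[0]] += 1
    return counts


# Toy gene models and read alignments (made-up coordinates).
example_genes = {
    "geneA": Interval("chr1", 100, 200),
    "geneB": Interval("chr1", 150, 300),
    "geneC": Interval("chr2", 0, 50),
}
example_reads = [
    Interval("chr1", 110, 140),  # inside geneA only
    Interval("chr1", 160, 190),  # overlaps geneA and geneB
    Interval("chr1", 250, 280),  # inside geneB only
    Interval("chr2", 10, 40),    # inside geneC
    Interval("chr2", 60, 90),    # overlaps no gene
    Interval("chr1", 200, 220),  # starts where geneA ends, so geneB only
]
example_counts = count_reads_per_gene(example_genes, example_reads)

for name, n in sorted(example_counts.items()):
    print(name, n)
```

The linear scan over genes keeps the sketch short. A data structure that can be queried by genomic coordinates, as the abstract describes, would replace it in practice.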

Availability and Implementation: HTSeq is released as open-source software under the GNU General Public Licence and is available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq.

Citing Articles

A TaSnRK1α-TaCAT2 model mediates resistance to Fusarium crown rot by scavenging ROS in common wheat.

Yang X, Zhang L, Wei J, Liu L, Liu D, Yan X. Nat Commun. 2025; 16(1):2549.

PMID: 40089587 DOI: 10.1038/s41467-025-57936-x.


SIRT5 safeguards against primate skeletal muscle ageing via desuccinylation of TBK1.

Zhao Q, Jing Y, Jiang X, Zhang X, Liu F, Huang H. Nat Metab. 2025.

PMID: 40087407 DOI: 10.1038/s42255-025-01235-8.


Expression of ENL YEATS domain tumor mutations in nephrogenic or stromal lineage impairs kidney development.

Xue Z, Xuan H, Lau K, Su Y, Wegener M, Li K. Nat Commun. 2025; 16(1):2531.

PMID: 40087269 DOI: 10.1038/s41467-025-57926-z.


GWAS and transcriptome analyses unravel ZmGRAS15 regulates drought tolerance and root elongation in maize.

Wang D, Liu X, He G, Wang K, Li Y, Guan H. BMC Genomics. 2025; 26(1):246.

PMID: 40082805 PMC: 11907892. DOI: 10.1186/s12864-025-11435-x.


Total tocopherol levels in maize grain depend on chlorophyll biosynthesis within the embryo.

Herr S, Li X, Wu D, Hunter C, Magallanes-Lundback M, Wood J. BMC Plant Biol. 2025; 25(1):328.

PMID: 40082754 PMC: 11905637. DOI: 10.1186/s12870-025-06267-6.

