GenomeScope: Fast Reference-free Genome Profiling from Short Reads

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2017 Apr 4
PMID: 28369201
Citations: 837
Abstract

Summary: GenomeScope is an open-source web tool to rapidly estimate the overall characteristics of a genome, including genome size, heterozygosity rate and repeat content from unprocessed short reads. These features are essential for studying genome evolution, and help to choose parameters for downstream analysis. We demonstrate its accuracy on 324 simulated and 16 real datasets with a wide range in genome sizes, heterozygosity levels and error rates.
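As a rough illustration of how genome size can be estimated from unassembled reads, the sketch below counts k-mers (length-k substrings), builds a histogram of their multiplicities, and divides the total number of k-mers by the modal coverage. This is a simplified peak-based estimate, not GenomeScope's own method: GenomeScope fits a statistical model to the k-mer profile, which is also how it obtains heterozygosity and repeat content. The function names and the `min_coverage` cutoff are hypothetical and chosen only for this example.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (length-k substring) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_histogram(counts):
    """Map each multiplicity to the number of distinct k-mers seen that often."""
    return Counter(counts.values())


def estimate_genome_size(histogram, min_coverage=2):
    """Peak-based estimate: total k-mers divided by the modal k-mer coverage.

    Multiplicities below min_coverage are dropped as likely sequencing
    errors (an assumption of this sketch, not a GenomeScope parameter).
    """
    kept = {m: n for m, n in histogram.items() if m >= min_coverage}
    if not kept:
        raise ValueError("no k-mers at or above min_coverage")
    peak = max(kept, key=kept.get)  # multiplicity shared by the most k-mers
    total_kmers = sum(m * n for m, n in kept.items())
    return total_kmers / peak


if __name__ == "__main__":
    # Toy histogram: 100 error k-mers seen once, a main peak at 10x,
    # and a few two-copy repeats at 20x.
    toy = {1: 100, 10: 50, 20: 5}
    print(estimate_genome_size(toy))  # 60.0
```

In practice the histogram would come from a dedicated k-mer counter run on the raw reads rather than from a pure-Python loop, and a single-peak estimate like this one becomes unreliable for highly heterozygous genomes, where the profile has more than one peak.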

Availability and Implementation: http://genomescope.org and https://github.com/schatzlab/genomescope.git

Contact: mschatz@jhu.edu.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Citing Articles

Chromosome-level genome assembly of a critically endangered species Leuciscus chuanchicus.

Wang Q, Zhou Q, Liu H, Li J, Jiang Y. Sci Data. 2025; 12(1):441.

PMID: 40089515 DOI: 10.1038/s41597-025-04787-2.


A chromosomal-level genome assembly of Begonia fimbristipula (Begoniaceae).

Xiao T, Wang Z, Yan H. Sci Data. 2025; 12(1):429.

PMID: 40074751 PMC: 11904028. DOI: 10.1038/s41597-025-04768-5.


Chromosome-level genome assembly of the clam, Xishi tongue Coelomactra antiquata.

Shen Y, Wang Y, Kong L. Sci Data. 2025; 12(1):422.

PMID: 40069159 PMC: 11897284. DOI: 10.1038/s41597-025-04734-1.


High-resolution genome assembly and population genetic study of the endangered maple (Sapindaceae): implications for conservation strategies.

Li X, Jiang L, Deng H, Yu Q, Ju W, Chen X. Hortic Res. 2025; 12(4):uhae357.

PMID: 40066161 PMC: 11891484. DOI: 10.1093/hr/uhae357.


Chromosome-level genome assembly and annotation of pawak croaker (Pennahia pawak).

Jiang L, Zheng P, Zheng J, Liu Y, Song W, Chen S. Sci Data. 2025; 12(1):412.

PMID: 40064941 PMC: 11894176. DOI: 10.1038/s41597-025-04745-y.

