
Next Generation Transcriptomes for Next Generation Genomes Using Est2assembly

Overview
Publisher BioMed Central
Specialty Biology
Date 2009 Dec 26
PMID 20034392
Citations 43
Abstract

Background: The decreasing costs of capillary-based Sanger sequencing and next-generation technologies, such as 454 pyrosequencing, have prompted an explosion of transcriptome projects in non-model species, where even shallow sequencing of transcriptomes can now be used to examine a range of research questions. This rapid growth in data has outstripped the ability of researchers working on non-model species to analyze and mine transcriptome data efficiently.

Results: Here we present a semi-automated platform, 'est2assembly', that processes raw sequence data from Sanger or 454 sequencing into a hybrid de novo assembly, annotates it, and produces GMOD-compatible output, including a SeqFeature database suitable for GBrowse. Users are able to parameterize assembler variables, judge assembly quality and determine the optimal assembly for their specific needs. We used est2assembly to process Drosophila and Bicyclus public Sanger EST data and then compared them to published 454 data as well as eight new insect transcriptome collections.

Conclusions: Analysis of such a wide variety of data allows us to understand how these new technologies can assist EST project design. We determine that assembler parameterization is as essential as standardized methods for judging the output of EST projects. Further, even shallow sequencing using 454 produces sufficient data to be of wide use to the community. est2assembly is an important tool to assist manual curation of gene models, which are an important resource in their own right, especially for species that are due to acquire a genome project using Next Generation Sequencing.
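
The abstract notes that users judge assembly quality across assembler parameter settings but does not say which metrics est2assembly reports. As a minimal sketch, assuming contig lengths are already available for each run, the Python below compares runs using contig count, total bases and N50. N50 is a widely used summary statistic chosen here for illustration only, not necessarily the tool's own measure, and the run names and lengths are hypothetical.

```python
from typing import Dict, List


def n50(lengths: List[int]) -> int:
    """Return the N50: the contig length at which the running total,
    counted from the longest contig down, reaches half of all bases."""
    if not lengths:
        return 0
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    return 0


def summarize(lengths: List[int]) -> Dict[str, int]:
    """Collect simple summary statistics for one assembly."""
    return {
        "contigs": len(lengths),
        "total_bases": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
    }


# Hypothetical contig lengths from two assembler parameter settings
# (illustrative values, not data from the paper).
runs = {
    "strict_overlap": [1200, 900, 800, 400, 300],
    "relaxed_overlap": [2500, 700, 250, 150],
}

for name, lengths in runs.items():
    print(name, summarize(lengths))
```

A higher N50 alone does not make an assembly better: relaxed settings can merge distinct transcripts into longer contigs, so a comparison like this is best read alongside other evidence such as annotation results.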

Citing Articles

A chromosome-scale assembly of allotetraploid Brassica juncea (AABB) elucidates comparative architecture of the A and B genomes.

Paritosh K, Yadava S, Singh P, Bhayana L, Mukhopadhyay A, Gupta V. Plant Biotechnol J. 2020; 19(3):602-614.

PMID: 33073461 PMC: 7955877. DOI: 10.1111/pbi.13492.


Comprehensive transcriptome analysis of Sarcophaga peregrina, a forensically important fly species.

Kim J, Lim H, Shin S, Cha H, Seo J, Kim S. Sci Data. 2018; 5:180220.

PMID: 30398471 PMC: 6219405. DOI: 10.1038/sdata.2018.220.


Transcriptome analysis of hexaploid hulless oat in response to salinity stress.

Wu B, Hu Y, Huo P, Zhang Q, Chen X, Zhang Z. PLoS One. 2017; 12(2):e0171451.

PMID: 28192458 PMC: 5305263. DOI: 10.1371/journal.pone.0171451.


The life cycle of a genome project: perspectives and guidelines inspired by insect genome projects.

Papanicolaou A. F1000Res. 2016; 5:18.

PMID: 27006757 PMC: 4798206. DOI: 10.12688/f1000research.7559.1.


First Microsatellite Markers Developed from Cupuassu ESTs: Application in Diversity Analysis and Cross-Species Transferability to Cacao.

Ferraz Dos Santos L, Moreira Fregapani R, Falcao L, Togawa R, Costa M, Lopes U. PLoS One. 2016; 11(3):e0151074.

PMID: 26949967 PMC: 4780773. DOI: 10.1371/journal.pone.0151074.

