ESTprep: Preprocessing cDNA Sequence Reads
Overview
Motivation: Large-scale gene discovery projects depend on highly accurate data. The data must not only be trustworthy but also be correctly annotated for the various features they contain. Sequence errors are inherent in single-pass sequences such as ESTs obtained from automated sequencing, and these errors further complicate the automated identification of EST-related sequence features. A tool is therefore required to prepare the data prior to advanced annotation processing and submission to public databases.
Results: This paper describes ESTprep, a program designed to preprocess expressed sequence tag (EST) sequences. It identifies the location of features present in ESTs and allows the sequence to pass only if it meets various quality criteria. Use of ESTprep has resulted in substantial improvement in accurate EST feature identification and fidelity of results submitted to GenBank.
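As a rough illustration of the locate-then-filter idea described above, the following Python sketch finds one feature in a read and lets the read pass only if it meets some quality criteria. Everything specific here is an assumption made for illustration and is not taken from ESTprep itself: the choice of a 3' poly(A) tail as the feature, the function names `find_polya_tail` and `passes_quality`, and the thresholds `min_tail`, `min_insert_len`, and `max_n_fraction` are all hypothetical.

```python
import re
from typing import Optional, Tuple


def find_polya_tail(seq: str, min_tail: int = 10) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first run of at least min_tail A's, or None.

    Hypothetical feature locator; a real tool would likely tolerate
    sequencing errors inside the tail.
    """
    match = re.search(r"A{%d,}" % min_tail, seq.upper())
    return (match.start(), match.end()) if match else None


def passes_quality(
    seq: str,
    min_insert_len: int = 50,      # assumed minimum usable insert length
    max_n_fraction: float = 0.05,  # assumed ceiling on ambiguous bases
    min_tail: int = 10,            # assumed minimum poly(A) run length
) -> bool:
    """Pass a read only if the part before any poly(A) tail is long and clean."""
    tail = find_polya_tail(seq, min_tail)
    # Treat everything upstream of the tail (or the whole read) as the insert.
    insert = seq[: tail[0]] if tail else seq
    if not insert or len(insert) < min_insert_len:
        return False
    n_fraction = insert.upper().count("N") / len(insert)
    return n_fraction <= max_n_fraction
```

A read would then be kept or rejected with a call such as `passes_quality(read)`. Keeping the feature location separate from the filter means the coordinates could also be reported for annotation, though how ESTprep actually structures this is not stated in the abstract.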
Availability: The program is freely available for download from http://genome.uiowa.edu/pubsoft/software.html