
The Construction of Arabidopsis Expressed Sequence Tag Assemblies. A New Resource to Facilitate Gene Identification

Overview
Journal: Plant Physiol
Specialty: Physiology
Date: 1996 Nov 1
PMID: 8938416
Citations: 22
Abstract

The generation of large numbers of partial cDNA sequences, or expressed sequence tags (ESTs), has provided a method with which to sample a large number of genes from an organism. More than 25,000 Arabidopsis thaliana ESTs have been deposited in public databases, producing the largest collection of ESTs for any plant species. We describe here the application of a method of reducing redundancy and increasing information content in this collection by grouping overlapping ESTs representing the same gene into a "contig" or assembly. The increased information content of these assemblies allows more putative identifications to be assigned based on the results of similarity searches with nucleotide and protein databases. The results of this analysis indicate that sequence information is available for approximately 12,600 nonoverlapping ESTs from Arabidopsis. Comparison of the assemblies with 953 Arabidopsis coding sequences indicates that up to 57% of all Arabidopsis genes are represented by an EST. Clustering analysis of these sequences suggests that between 300 and 700 gene families are represented by between 700 and 2000 sequences in the EST database. A database of the assembled sequences, their putative identifications, and cellular roles is available through the World Wide Web.
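The grouping step described in the abstract, in which overlapping ESTs from the same gene are collected into one assembly, can be illustrated with a minimal sketch. This is not the procedure used in the paper: it assumes exact suffix-prefix matches, whereas real EST assembly relies on alignments that tolerate sequencing errors. The function names (`overlap_length`, `group_ests`), the `min_overlap` parameter, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def overlap_length(left, right, min_overlap):
    """Return the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when no exact overlap of at least `min_overlap` bases exists.
    Assumption: exact matching only; sequencing errors are not tolerated.
    """
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def group_ests(ests, min_overlap=8):
    """Group ESTs into candidate assemblies by single-linkage clustering.

    `ests` maps a (hypothetical) EST name to its sequence. Two ESTs are
    linked when one overlaps or contains the other; linked ESTs end up
    in the same group via a union-find structure.
    """
    parent = {name: name for name in ests}

    def find(name):
        # Follow parent links to the group representative, with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    names = list(ests)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            seq_a, seq_b = ests[a], ests[b]
            linked = (
                overlap_length(seq_a, seq_b, min_overlap) > 0
                or overlap_length(seq_b, seq_a, min_overlap) > 0
                or seq_a in seq_b
                or seq_b in seq_a
            )
            if linked:
                union(a, b)

    groups = defaultdict(list)
    for name in names:
        groups[find(name)].append(name)
    return sorted(sorted(members) for members in groups.values())


# Toy data (invented for this example): est1-est3 chain together, est4 stands alone.
example = {
    "est1": "ATGGCGTACGTTAGC",
    "est2": "CGTTAGCTTGACCGA",
    "est3": "TTGACCGAGGCTAAT",
    "est4": "GGGCCCAAATTTGGG",
}
assemblies = group_ests(example, min_overlap=6)
```

With the toy data, `est1`, `est2`, and `est3` fall into one group because each overlaps the next, while `est4` remains a singleton. The all-pairs comparison is quadratic in the number of ESTs, so a collection of the size described above would need an indexed similarity search rather than this brute-force loop.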

Citing Articles

MAKER-P: a tool kit for the rapid creation, management, and quality control of plant genome annotations.

Campbell M, Law M, Holt C, Stein J, Moghe G, Hufnagel D. Plant Physiol. 2013; 164(2):513-24.

PMID: 24306534 PMC: 3912085. DOI: 10.1104/pp.113.230144.


Generation, functional annotation and comparative analysis of black spruce (Picea mariana) ESTs: an important conifer genomic resource.

Mann I, Wegrzyn J, Rajora O. BMC Genomics. 2013; 14:702.

PMID: 24119028 PMC: 4007752. DOI: 10.1186/1471-2164-14-702.


Expressed sequence tags in cultivated peanut (Arachis hypogaea): discovery of genes in seed development and response to Ralstonia solanacearum challenge.

Huang J, Yan L, Lei Y, Jiang H, Ren X, Liao B. J Plant Res. 2012; 125(6):755-69.

PMID: 22648474 DOI: 10.1007/s10265-012-0491-9.


Complementary DNA sequencing and identification of mRNAs from the venomous gland of Agkistrodon piscivorus leucostoma.

Jia Y, Cantu B, Sanchez E, Perez J. Toxicon. 2008; 51(8):1457-66.

PMID: 18502463 PMC: 3437923. DOI: 10.1016/j.toxicon.2008.03.028.


hORFeome v3.1: a resource of human open reading frames representing over 10,000 human genes.

Lamesch P, Li N, Milstein S, Fan C, Hao T, Szabo G. Genomics. 2007; 89(3):307-15.

PMID: 17207965 PMC: 4647941. DOI: 10.1016/j.ygeno.2006.11.012.

