PMID: 11593022

The Contribution of 700,000 ORF Sequence Tags to the Definition of the Human Transcriptome

Abstract

Open reading frame expressed sequence tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein-coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used the subset of the data corresponding to a set of 15,095 full-length mRNAs to assess the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed human genes, and between 40% and 50% of rarely expressed ones. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts of an estimated 70% of all genes expressed in that tissue, with equally efficient representation of highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy for both gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. Experimentally joining the scaffold components by reverse transcription-PCR offers a direct route to transcript finishing that may be a useful alternative to full-length cDNA cloning.

Citing Articles

In vitro and in silico validation of CA3 and FHL1 downregulation in oral cancer.

Pereira C, de Carvalho A, Silva F, Melendez M, Lessa R, Andrade V BMC Cancer. 2018; 18(1):193.

PMID: 29454310 PMC: 5816396. DOI: 10.1186/s12885-018-4077-3.


Expression of human protein S100A7 (psoriasin), preparation of antibody and application to human larynx squamous cell carcinoma.

Barbieri M, Andrade C, Silva Jr W, Marques A, Leopoldino A, Montes M BMC Res Notes. 2011; 4:494.

PMID: 22082027 PMC: 3278597. DOI: 10.1186/1756-0500-4-494.


Long noncoding intronic RNAs are differentially expressed in primary and metastatic pancreatic cancer.

Tahira A, Kubrusly M, Faria M, Dazzani B, Fonseca R, Maracaja-Coutinho V Mol Cancer. 2011; 10:141.

PMID: 22078386 PMC: 3225313. DOI: 10.1186/1476-4598-10-141.


Temporal blastemal cell gene expression analysis in the kidney reveals new Wnt and related signaling pathway genes to be essential for Wilms' tumor onset.

Maschietto M, Trape A, Piccoli F, Ricca T, Dias A, Coudry R Cell Death Dis. 2011; 2:e224.

PMID: 22048167 PMC: 3223691. DOI: 10.1038/cddis.2011.105.


Gene network analyses point to the importance of human tissue kallikreins in melanoma progression.

Martins W, Esteves G, Almeida O, Rezze G, Landman G, Marques S BMC Med Genomics. 2011; 4:76.

PMID: 22032772 PMC: 3212933. DOI: 10.1186/1755-8794-4-76.

