Comparing Vertebrate Whole-genome Shotgun Reads to the Human Genome
Overview
Multi-species sequence comparison is an efficient way to reveal conserved genes. Because sequence finishing is expensive and time-consuming, many genome sequences are likely to remain incomplete. The challenge is to use such fragmentary data to understand the human genome. This paper describes methods for using cross-species whole-genome shotgun (WGS) sequence for genome annotation. About half a million high-quality rat WGS reads (covering 7.5% of the rat genome), generated at the Baylor College of Medicine Human Genome Sequencing Center, were compared with the human genome. Using computer-generated random reads as a negative control, a set of parameters was determined for reliable interpretation of BLAST search results. About 10% of the rat reads contain regions that are conserved in the human genomic sequence, and about one-third of these include known gene-coding regions. Mapping the conserved regions to human chromosomes showed a 23-fold enrichment for coding regions compared with noncoding regions. This approach can also be applied to other mammalian genomes for gene finding. These data predict approximately 42,500 genes in the human genome, slightly more than reported previously.
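The negative-control step can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not say which BLAST statistic was thresholded, so the sketch assumes a per-read best-hit bit score, and the function names (`choose_score_cutoff`, `conserved_fraction`) and all numbers are hypothetical toy values. The idea is to pick the lowest cutoff at which the fraction of random control reads still passing stays at or below a chosen false-positive rate, then apply that cutoff to the real reads.

```python
import math
from typing import Sequence


def choose_score_cutoff(control_scores: Sequence[float],
                        n_control_reads: int,
                        max_false_positive_rate: float) -> float:
    """Return a cutoff T such that at most
    max_false_positive_rate * n_control_reads control reads score above T.

    control_scores holds the best-hit score of each random control read
    that produced any hit; control reads with no hit are simply absent.
    Scores are assumed to be positive (e.g. BLAST bit scores).
    """
    # Number of control reads we tolerate above the cutoff.
    allowed = math.floor(max_false_positive_rate * n_control_reads + 1e-9)
    ranked = sorted(control_scores, reverse=True)
    if allowed >= len(ranked):
        # Even accepting every control hit stays within the budget.
        return 0.0
    # Reads must score strictly above the (allowed + 1)-th best control hit.
    return ranked[allowed]


def conserved_fraction(read_scores: Sequence[float],
                       n_reads: int,
                       cutoff: float) -> float:
    """Fraction of all reads whose best-hit score is strictly above cutoff."""
    passing = sum(1 for score in read_scores if score > cutoff)
    return passing / n_reads


if __name__ == "__main__":
    # Toy values for illustration only; not data from the paper.
    control = [30.0, 35.0, 40.0, 42.0, 55.0, 60.0]
    cutoff = choose_score_cutoff(control, n_control_reads=100,
                                 max_false_positive_rate=0.02)
    real = [50.0, 41.0, 70.0, 42.0, 43.0]
    print(cutoff, conserved_fraction(real, n_reads=10, cutoff=cutoff))
```

Using a strict comparison against the cutoff keeps tied control scores from pushing the false-positive count over the allowed budget.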