MetaGene: Prokaryotic Gene Finding from Environmental Genome Shotgun Sequences

Overview
Specialty Biochemistry
Date 2006 Oct 10
PMID 17028096
Citations 362
Abstract

Exhaustive gene identification is a fundamental goal in all metagenomics projects. However, most metagenomic sequences are unassembled anonymous fragments, and conventional gene-finding methods cannot be applied. We have developed a prokaryotic gene-finding program, MetaGene, which uses di-codon frequencies estimated from the GC content of a given sequence, together with various other measures. MetaGene can predict a whole range of prokaryotic genes from anonymous genomic sequences of a few hundred bases, with a sensitivity of 95% and a specificity of 90% for artificial shotgun sequences (700 bp fragments from 12 species). MetaGene has two sets of codon frequency interpolations, one for bacteria and one for archaea, and automatically selects the proper set for a given sequence using the domain classification method we propose. The domain classification works properly, correctly assigning domain information to more than 90% of the artificial shotgun sequences. Applied to the Sargasso Sea dataset, MetaGene predicted almost all of the annotated genes and a notable number of novel genes. MetaGene can be applied to a wide variety of metagenomic projects and expands the utility of metagenomics.
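
The two raw quantities the abstract mentions, the GC content of a fragment and its di-codon (codon-pair) counts, can be illustrated with a minimal Python sketch. This is not MetaGene's code or its scoring model: the function names `gc_content` and `dicodon_counts` and the toy fragment are assumptions made for illustration, and the step that estimates di-codon frequencies from GC content is not reproduced here.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def dicodon_counts(seq: str, frame: int = 0) -> Counter:
    """Count codon pairs (6-mers read in steps of 3) in one reading frame.

    Hypothetical helper: hexamers containing ambiguous bases are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(frame, len(seq) - 5, 3):
        hexamer = seq[i:i + 6]
        if set(hexamer) <= set("ACGT"):
            counts[hexamer] += 1
    return counts


# Toy fragment (made up for illustration, not taken from any dataset).
fragment = "ATGGCGAAAGCGAAATAA"
print(round(gc_content(fragment), 3))
print(dicodon_counts(fragment).most_common(3))
```

On a real shotgun read, the same counts would be collected for each of the six reading frames (three on each strand) before any scoring is attempted.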

Citing Articles

Characterization and Optimization of Cellulose-Degrading Bacteria Isolated from Fecal Samples of Through Response Surface Methodology.

Wu H, Shi C, Xu T, Dai X, Zhao D Microorganisms. 2025; 13(2).

PMID: 40005715 PMC: 11858180. DOI: 10.3390/microorganisms13020348.


Characterization and comparison of gut microbiota in patients with acute pancreatitis by metagenomics and culturomics.

Gong L, Li X, Ji L, Chen G, Han Z, Su L Heliyon. 2025; 11(3):e42243.

PMID: 39931490 PMC: 11808722. DOI: 10.1016/j.heliyon.2025.e42243.


How do various strategies for returning residues change microbiota modulation: potential implications for soil health.

Jiang N, Chen Z, Ren Y, Xie S, Yao Z, Jiang D Front Microbiol. 2025; 15:1495682.

PMID: 39906540 PMC: 11790580. DOI: 10.3389/fmicb.2024.1495682.


Interactions between gut microbes and host promote degradation of various fiber components in Meishan pigs.

Pu G, Hou L, Zhao Q, Liu G, Wang Z, Zhou W mSystems. 2025; 10(2):e0150024.

PMID: 39873521 PMC: 11834408. DOI: 10.1128/msystems.01500-24.


Effects of two strains isolated from different sources on the growth of .

Wang B, Tan S, Wu M, Feng Y, Yan W, Yun Q Front Microbiol. 2024; 15:1504660.

PMID: 39717271 PMC: 11663850. DOI: 10.3389/fmicb.2024.1504660.

