Selection of Key Sequence-based Features for Prediction of Essential Genes in 31 Diverse Bacterial Species

Overview
Journal PLoS One
Date 2017 Mar 31
PMID 28358836
Citations 13
Abstract

Essential genes are genes that are indispensable for an organism's survival. Many features have been proposed for the computational prediction of essential genes. In this paper, the least absolute shrinkage and selection operator (LASSO) method was used to screen for key sequence-based features related to gene essentiality. To assess the effect of this screening, the selected features were used to predict essential genes in 31 bacterial species with a support vector machine classifier. Across the 31 bacterial species (21 Gram-negative and 10 Gram-positive), the number of features in the three datasets was reduced from 57, 59, and 58 to 40, 37, and 38, respectively, without loss of prediction accuracy. The results showed that some features were redundant for predicting gene essentiality and could therefore be eliminated from future analyses. The selected features contained more complex (or key) biological information about gene essentiality and could be of use in related research, such as gene prediction, synthetic biology, and drug design.
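To make the screening idea in the abstract concrete, the following is a minimal sketch of LASSO-based feature selection, not the authors' pipeline. It assumes synthetic data (300 hypothetical "genes", six hypothetical features, of which only the first two carry signal), an arbitrarily chosen penalty `alpha`, and a plain coordinate-descent solver written with the Python standard library. All function and variable names are illustrative, and the support vector machine step that follows the screening in the paper is omitted.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exact zeros give LASSO its sparsity."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def standardise(rows):
    """Scale every column to zero mean and unit (population) variance."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [(sum((r[j] - means[j]) ** 2 for r in rows) / n) ** 0.5 for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant column, nothing to fit
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + w[j] * X[i][j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_w
    return w


# Synthetic stand-in data (an assumption, not data from the paper):
# features 0 and 1 drive the essential (+1) / non-essential (-1) label,
# features 2 to 5 are pure noise.
random.seed(0)
N_GENES, N_FEATURES = 300, 6
X_raw = [[random.gauss(0, 1) for _ in range(N_FEATURES)] for _ in range(N_GENES)]
labels = [
    1.0 if 2.0 * row[0] - 1.5 * row[1] + 0.3 * random.gauss(0, 1) > 0 else -1.0
    for row in X_raw
]

X = standardise(X_raw)
mean_label = sum(labels) / N_GENES
y = [value - mean_label for value in labels]  # centre the target

weights = lasso_coordinate_descent(X, y, alpha=0.25)
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("selected feature indices:", selected)
```

Features whose coefficient is driven exactly to zero are the candidates for removal. In a fuller workflow the retained columns would then be passed to a classifier, and the penalty would be tuned (for example by cross-validation) rather than fixed by hand as it is here.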

Citing Articles

Recent advances in the characterization of essential genes and development of a database of essential genes.

Liang Y, Luo H, Lin Y, Gao F. Imeta. 2024; 3(1):e157.

PMID: 38868518 PMC: 10989110. DOI: 10.1002/imt2.157.


Evaluation of machine learning classifiers for predicting essential genes in strains.

Mukul Das M, Sarkar K. Bioinformation. 2023; 18(12):1126-1130.

PMID: 37701504 PMC: 10492903. DOI: 10.6026/973206300181126.


Identification of discriminant features from stationary pattern of nucleotide bases and their application to essential gene classification.

Rout R, Umer S, Khandelwal M, Pati S, Mallik S, Balabantaray B. Front Genet. 2023; 14:1154120.

PMID: 37152988 PMC: 10156977. DOI: 10.3389/fgene.2023.1154120.


Bacterial genome reductions: Tools, applications, and challenges.

LeBlanc N, Charles T. Front Genome Ed. 2022; 4:957289.

PMID: 36120530 PMC: 9473318. DOI: 10.3389/fgeed.2022.957289.


NetGenes: A Database of Essential Genes Predicted Using Features From Interaction Networks.

Senthamizhan V, Ravindran B, Raman K. Front Genet. 2021; 12:722198.

PMID: 34630517 PMC: 8495214. DOI: 10.3389/fgene.2021.722198.

