
Recent Advances in Machine-Learning-Based Chemoinformatics: A Comprehensive Review

Overview
Journal Int J Mol Sci
Publisher MDPI
Date 2023 Jul 29
PMID 37511247
Abstract

In modern drug discovery, chemoinformatics and quantitative structure-activity relationship (QSAR) modeling have become a powerful combination, enabling researchers to apply machine learning (ML) techniques to predictive molecular design and analysis. This review covers the fundamental aspects of chemoinformatics, explaining the nature of chemical data and the role of molecular descriptors in capturing underlying molecular properties. Molecular descriptors, including 2D fingerprints and topological indices, together with structure-activity relationships (SARs), are central to small-molecule drug discovery. The technical aspects of developing robust ML-QSAR models, including feature selection, model validation, and performance evaluation, are discussed. Various ML algorithms, such as regression analysis and support vector machines, are presented for their ability to predict and explain the relationships between molecular structures and biological activities. This review serves as a comprehensive guide for researchers, providing an understanding of the synergy between chemoinformatics, QSAR, and ML. With the adoption of these technologies, predictive molecular analysis holds promise for expediting the discovery of novel therapeutic agents in the pharmaceutical sciences.
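As a rough illustration of the ideas named in the abstract (2D fingerprints as descriptors, a simple predictive model, and model validation), the sketch below uses only the Python standard library. It is not taken from the review: the fingerprints are hypothetical sets of "on" bit positions, the activity values are invented toy numbers, and the similarity-weighted nearest-neighbour model stands in for the regression and support vector machine methods the review discusses. The function names `tanimoto`, `knn_predict`, and `leave_one_out_q2` are assumptions made for this example.

```python
from statistics import mean


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def knn_predict(query: frozenset, training: list, k: int = 2) -> float:
    """Predict activity as the similarity-weighted mean of the k most similar compounds."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)[:k]
    weights = [tanimoto(query, fp) for fp, _ in ranked]
    total = sum(weights)
    if total == 0:
        # No structural overlap with any neighbour: fall back to a plain mean.
        return mean(act for _, act in ranked)
    return sum(w * act for w, (_, act) in zip(weights, ranked)) / total


def leave_one_out_q2(dataset: list, k: int = 2) -> float:
    """Cross-validated q^2 = 1 - PRESS / SS_tot, holding out one compound at a time."""
    observed = [act for _, act in dataset]
    predicted = [
        knn_predict(fp, dataset[:i] + dataset[i + 1:], k)
        for i, (fp, _) in enumerate(dataset)
    ]
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    avg = mean(observed)
    ss_tot = sum((o - avg) ** 2 for o in observed)
    return 1 - press / ss_tot


# Hypothetical toy data: (fingerprint on-bits, activity such as pIC50).
# Two invented structural families with different activity levels.
TOY_DATA = [
    (frozenset({1, 2, 3, 4}), 6.5),
    (frozenset({1, 2, 3, 5}), 6.8),
    (frozenset({1, 2, 4, 5}), 6.2),
    (frozenset({7, 8, 9}), 4.1),
    (frozenset({7, 8, 10}), 4.4),
    (frozenset({8, 9, 10}), 3.9),
]

if __name__ == "__main__":
    print("Tanimoto:", tanimoto(TOY_DATA[0][0], TOY_DATA[1][0]))
    print("LOO q^2:", round(leave_one_out_q2(TOY_DATA), 3))
```

In practice, fingerprints would be computed from real structures with a chemoinformatics toolkit and the model validated on an external test set as well; the leave-one-out loop here only shows the shape of an internal validation step.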

Citing Articles

Advancements and Applications of Artificial Intelligence in Pharmaceutical Sciences: A Comprehensive Review.

Mottaghi-Dastjerdi N, Soltany-Rezaee-Rad M Iran J Pharm Res. 2025; 23(1):e150510.

PMID: 39895671 PMC: 11787549. DOI: 10.5812/ijpr-150510.


Harnessing the AI/ML in Drug and Biological Products Discovery and Development: The Regulatory Perspective.

Mirakhori F, Niazi S Pharmaceuticals (Basel). 2025; 18(1).

PMID: 39861110 PMC: 11769376. DOI: 10.3390/ph18010047.


A review of large language models and autonomous agents in chemistry.

Ramos M, Collison C, White A Chem Sci. 2025; 16(6):2514-2572.

PMID: 39829984 PMC: 11739813. DOI: 10.1039/d4sc03921a.


Machine learning and molecular docking prediction of potential inhibitors against dengue virus.

Hanson G, Adams J, Kepgang D, Zondagh L, Tem Bueh L, Asante A Front Chem. 2025; 12:1510029.

PMID: 39776767 PMC: 11703810. DOI: 10.3389/fchem.2024.1510029.


AI and ML-based risk assessment of chemicals: predicting carcinogenic risk from chemical-induced genomic instability.

Singh A, Bhardwaj P, Laux P, Pradeep P, Busse M, Luch A Front Toxicol. 2024; 6:1461587.

PMID: 39659701 PMC: 11628524. DOI: 10.3389/ftox.2024.1461587.

