
Ensemble Modeling with Machine Learning and Deep Learning to Provide Interpretable Generalized Rules for Classifying CNS Drugs with High Prediction Power

Overview
Journal: Brief Bioinform
Specialty: Biology
Date: 2021 Sep 16
PMID: 34530437
Citations: 14
Abstract

The trade-off between the predictive power of machine learning (ML) and deep learning (DL) models and their interpretability is a growing concern in central nervous system-related quantitative structure-activity relationship (CNS-QSAR) analysis. Many state-of-the-art predictive models fail to provide structural insights because of their black box-like nature, and this lack of interpretability makes it difficult to derive simple, usable rules from CNS-QSAR models. To address these issues, we developed a protocol that combines ML and DL to generate a set of simple, easily interpreted rules with high prediction power. Support vector machine and graph convolutional network algorithms were applied to a data set of 940 marketed drugs (315 CNS-active, 625 CNS-inactive). Individual ML and DL models were also constructed for comparison. The performance of these models was evaluated on an additional external data set of 117 marketed drugs (42 CNS-active, 75 CNS-inactive). Fingerprint-split validation was adopted to ensure model stringency and generalizability. The resulting hybrid ensemble model outperformed its constituent traditional QSAR models, with an accuracy of 0.96 and an F1 score of 0.95. Using the interpretability this protocol provides, our model yielded a set of simple physicochemical rules, based on six sub-structural features, for determining whether a compound can be a CNS drug. These rules showed higher classification ability than classical guidelines, with higher specificity and more mechanistic insight than rules that address blood-brain barrier permeability alone. This hybrid protocol can potentially be used to predict other drug properties.
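As a rough illustration of the two ideas the abstract names, combining two classifiers and scoring the result by accuracy and F1, the sketch below averages per-compound probabilities from two models and thresholds the mean. The abstract does not state how the support vector machine and graph convolutional network outputs are combined, so the soft-voting rule here is an assumption, not the authors' method. The function names, the 0.5 threshold, and the toy labels and probabilities are all hypothetical and are not data from the paper.

```python
from typing import List, Sequence, Tuple


def soft_vote(p_a: Sequence[float], p_b: Sequence[float],
              threshold: float = 0.5) -> List[int]:
    """Average two models' 'CNS-active' probabilities and threshold the mean.

    Soft voting is an assumed combination rule, used only for illustration.
    """
    if len(p_a) != len(p_b):
        raise ValueError("probability lists must have equal length")
    return [int((a + b) / 2 >= threshold) for a, b in zip(p_a, p_b)]


def accuracy_and_f1(y_true: Sequence[int],
                    y_pred: Sequence[int]) -> Tuple[float, float]:
    """Compute accuracy and F1 for binary labels (1 = CNS-active)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy data: six compounds, not taken from the paper.
y_true = [1, 0, 1, 0, 1, 0]
p_svm = [0.9, 0.2, 0.4, 0.8, 0.8, 0.1]   # stand-in for SVM probabilities
p_gcn = [0.7, 0.1, 0.7, 0.4, 0.3, 0.2]   # stand-in for GCN probabilities

y_pred = soft_vote(p_svm, p_gcn)
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

With an imbalanced data set such as the one described (roughly one active compound for every two inactive ones), reporting F1 alongside accuracy is useful because accuracy alone can look high for a model that favors the majority class.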

Citing Articles

The Application of Machine Learning in Predicting the Permeability of Drugs Across the Blood Brain Barrier.

Jafarpour S, Asefzadeh M, Aboutaleb E Iran J Pharm Res. 2025; 23(1):e149367.

PMID: 40066117 PMC: 11892787. DOI: 10.5812/ijpr-149367.


Overcoming Challenges in Small-Molecule Drug Bioavailability: A Review of Key Factors and Approaches.

Wu K, Kwon S, Zhou X, Fuller C, Wang X, Vadgama J Int J Mol Sci. 2024; 25(23).

PMID: 39684832 PMC: 11642056. DOI: 10.3390/ijms252313121.


Transparent Machine Learning Model to Understand Drug Permeability through the Blood-Brain Barrier.

Jia H, Sosso G J Chem Inf Model. 2024; 64(23):8718-8728.

PMID: 39558528 PMC: 11632763. DOI: 10.1021/acs.jcim.4c01217.


Identification of structural features of surface modifiers in engineered nanostructured metal oxides regarding cell uptake through ML-based classification.

Dasgupta I, Das T, Das B, Gayen S Beilstein J Nanotechnol. 2024; 15:909-924.

PMID: 39076688 PMC: 11285082. DOI: 10.3762/bjnano.15.75.


Identifying Substructures That Facilitate Compounds to Penetrate the Blood-Brain Barrier via Passive Transport Using Machine Learning Explainer Models.

Rosa L, Argolo C, Nascimento C, Pimentel A ACS Chem Neurosci. 2024; 15(11):2144-2159.

PMID: 38723285 PMC: 11157485. DOI: 10.1021/acschemneuro.3c00840.

