
Data Mining the NCI60 to Predict Generalized Cytotoxicity

Overview
Date 2008 Jul 1
PMID 18588283
Citations 17
Abstract

Eliminating cytotoxic compounds in the early and later stages of drug discovery can help reduce the costs of research and development. By data mining with principal components analysis (PCA), we show that approximately 89% of the total log GI50 variance is due to the nonspecific cytotoxic nature of substances. Furthermore, PCA led to the identification of groups of structurally unrelated substances with very specific toxicity profiles, such as a set of 45 substances toxic only to the Leukemia_SR cancer cell line. To predict nonspecific cytotoxicity from the mean log GI50, we built a decision tree using MACCS keys that correctly classifies over 83% of the substances as cytotoxic or noncytotoxic in silico, using a cutoff of mean log GI50 = -5.0. Finally, we established a linear least-squares model in which nine of the 59 available NCI60 cancer cell lines are used to predict the mean log GI50. The model has R² = 0.99 and a root-mean-square error (RMSE) of 0.09 between the observed and calculated mean log GI50. Our predictive models can be applied to flag generally cytotoxic molecules in virtual and real chemical libraries, thus saving time and effort.
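
The three numerical ideas in the abstract (variance captured by the first principal component, a least-squares fit of the mean log GI50 from a subset of cell lines, and a cutoff at -5.0) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' code. The data are synthetic, the function names are hypothetical, and the nine columns used are simply the first nine, because the abstract does not name the nine cell lines. Treating values at or below the cutoff as cytotoxic is also an assumption, based on a lower GI50 meaning higher potency.

```python
import numpy as np


def make_synthetic_log_gi50(n_compounds=500, n_lines=59, seed=0):
    """Synthetic stand-in for a compounds x cell-lines log GI50 matrix.

    NOT real NCI60 data: each compound gets one shared potency term
    (the "nonspecific" part) plus small cell-line-specific noise.
    """
    rng = np.random.default_rng(seed)
    potency = rng.normal(-5.0, 1.0, size=(n_compounds, 1))
    noise = rng.normal(0.0, 0.3, size=(n_compounds, n_lines))
    return potency + noise


def first_pc_variance_fraction(X):
    """Fraction of total variance explained by the first principal component."""
    Xc = X - X.mean(axis=0)                   # center each cell-line column
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values of centered data
    var = s ** 2                              # proportional to PC variances
    return var[0] / var.sum()


def fit_subset_model(X, subset):
    """Least-squares fit of the mean log GI50 from a subset of cell lines."""
    y = X.mean(axis=1)                                    # target: mean over all lines
    A = np.column_stack([np.ones(len(X)), X[:, subset]])  # intercept + chosen lines
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    pred = A @ coef
    rmse = np.sqrt(np.mean((y - pred) ** 2))
    r2 = 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
    return coef, r2, rmse


def flag_cytotoxic(mean_log_gi50, cutoff=-5.0):
    """Flag compounds at or below the cutoff (direction is an assumption)."""
    return np.asarray(mean_log_gi50) <= cutoff


if __name__ == "__main__":
    X = make_synthetic_log_gi50()
    subset = list(range(9))  # arbitrary choice of nine columns for illustration
    coef, r2, rmse = fit_subset_model(X, subset)
    print("PC1 variance fraction:", first_pc_variance_fraction(X))
    print("R^2:", r2, "RMSE:", rmse)
    print("Flagged:", int(flag_cytotoxic(X.mean(axis=1)).sum()), "of", len(X))
```

On real data, the values printed here would be compared with the figures reported in the abstract. The synthetic numbers carry no such meaning.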

Citing Articles

LC-MS based cell metabolic profiling of tumor cells: a new predictive method for research on the mechanism of action of anticancer candidates.

Wang H, Hu J, Liu C, Liu M, Liu Z, Sun L RSC Adv. 2022; 8(30):16645-16656.

PMID: 35540548 PMC: 9080298. DOI: 10.1039/c8ra00242h.


Exploring gene knockout strategies to identify potential drug targets using genome-scale metabolic models.

Paul A, Anand R, Porey Karmakar S, Rawat S, Bairagi N, Chatterjee S Sci Rep. 2021; 11(1):213.

PMID: 33420254 PMC: 7794450. DOI: 10.1038/s41598-020-80561-1.


Comparing Multiple Machine Learning Algorithms and Metrics for Estrogen Receptor Binding Prediction.

Russo D, Zorn K, Clark A, Zhu H, Ekins S Mol Pharm. 2018; 15(10):4361-4370.

PMID: 30114914 PMC: 6181119. DOI: 10.1021/acs.molpharmaceut.8b00546.


Modelling compound cytotoxicity using conformal prediction and PubChem HTS data.

Svensson F, Norinder U, Bender A Toxicol Res (Camb). 2018; 6(1):73-80.

PMID: 30090478 PMC: 6061930. DOI: 10.1039/c6tx00252h.


Naïve Bayesian Models for Vero Cell Cytotoxicity.

Perryman A, Patel J, Russo R, Singleton E, Connell N, Ekins S Pharm Res. 2018; 35(9):170.

PMID: 29959603 PMC: 7768703. DOI: 10.1007/s11095-018-2439-9.

