Chemical Data Mining of the NCI Human Tumor Cell Line Database
Medical Informatics
The NCI Developmental Therapeutics Program Human Tumor Cell Line data set is a publicly available database that contains cellular assay screening data for over 40 000 compounds tested in 60 human tumor cell lines. The database also contains microarray gene expression data for the cell lines, which makes it an excellent resource, particularly for testing data mining methods that bridge chemical, biological, and genomic information. In this paper, we describe a formal knowledge discovery approach to characterizing and mining this data set, and we report the results of some of our initial experiments in mining it from a chemoinformatics perspective.