Active Learning with Support Vector Machine Applied to Gene Expression Data for Cancer Classification
Overview
Medical Informatics
There is growing interest in applying machine learning techniques to bioinformatics. Supervised machine learning has been widely and successfully applied in this research area. With this approach, however, researchers must first build a large labeled training set, which is a time-consuming and costly process. Moreover, the proportion of positive and negative examples in the training set may not represent the real-world data distribution, which causes concept drift. Active learning avoids these problems. Unlike most conventional learning methods, in which the training set used to derive the model remains static, active learning lets the classifier choose its own training data, so the training set grows as learning proceeds. We introduced an algorithm for performing active learning with a support vector machine and applied it to gene expression profiles of colon cancer, lung cancer, and prostate cancer samples. We compared the classification performance of active learning with that of passive learning. The results showed that active learning can achieve high accuracy while significantly reducing the need for labeled training instances. For lung cancer classification, reaching 96% of the total positives required only 31 labeled examples with active learning, whereas passive learning required 174 labeled examples, a reduction of over 82%. With active learning, the areas under the receiver operating characteristic (ROC) curves were over 0.81, whereas with passive learning they were below 0.50.
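The loop described in the abstract, in which the classifier picks its own training data and the labeled set grows, can be illustrated with a minimal sketch. The abstract does not state the query rule, so this sketch assumes margin-based uncertainty sampling (query the unlabeled sample closest to the current hyperplane), which is a common choice and not necessarily the authors' algorithm. The linear SVM is trained with a simple Pegasos-style sub-gradient method, the data are synthetic two-dimensional points rather than gene expression profiles, and all function names and parameters (`train_linear_svm`, `select_query`, `active_learn`, `lam`, `epochs`) are hypothetical.

```python
import random


def decision(w, x):
    """Signed score of x; the last entry of w acts as the bias term."""
    return sum(wj * xj for wj, xj in zip(w, x)) + w[-1]


def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Fit a linear SVM (labels in {-1, +1}) by Pegasos-style sub-gradient descent.

    Assumption: the bias is treated as a weight on a constant feature of 1.0,
    so it is regularized together with the other weights.
    """
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                      # decaying step size
            violated = y[i] * decision(w, X[i]) < 1    # inside or beyond margin
            w = [(1 - eta * lam) * wj for wj in w]     # shrink (regularizer)
            if violated:                               # hinge-loss sub-gradient
                xi = list(X[i]) + [1.0]
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, xi)]
    return w


def select_query(w, X, pool):
    """Return the pool index whose sample lies closest to the hyperplane."""
    return min(pool, key=lambda i: abs(decision(w, X[i])))


def active_learn(X, oracle, seed_idx, n_queries):
    """Grow the labeled set one query at a time, retraining after each label."""
    labeled = list(seed_idx)
    labels = {i: oracle(i) for i in labeled}
    pool = [i for i in range(len(X)) if i not in labels]

    def fit():
        return train_linear_svm([X[i] for i in labeled],
                                [labels[i] for i in labeled])

    w = fit()
    for _ in range(n_queries):
        if not pool:
            break
        q = select_query(w, X, pool)   # most uncertain unlabeled sample
        pool.remove(q)
        labels[q] = oracle(q)          # ask the annotator for its label
        labeled.append(q)
        w = fit()
    return w, labeled


# Synthetic stand-in data: two Gaussian clusters, 50 samples per class.
rng = random.Random(1)
X = [[rng.gauss(c, 1.0), rng.gauss(c, 1.0)] for c in (-2, 2) for _ in range(50)]
y_true = [-1] * 50 + [1] * 50

# Start from one labeled example per class and request 10 more labels.
w, labeled = active_learn(X, lambda i: y_true[i], seed_idx=[0, 50], n_queries=10)
print(len(labeled), "labeled examples used")
```

Passive learning, by contrast, would draw the next example to label at random from the pool instead of calling `select_query`. Swapping that single line gives a baseline for comparing how many labels each strategy needs.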