XGBMut: Predicting the Functional Impact of Missense Mutations Using an Extreme Gradient Boost Classifier
Overview
Millions of new mutations have been discovered, largely as a result of advances in genome projects, but characterizing their effects through traditional wet-lab experiments remains labor-intensive and time-consuming. Functional prediction algorithms address this problem by screening mutations efficiently, saving time and resources. The objective of this study was to develop a competitive algorithm for predicting the functional impact of missense mutations. First, a unified database and substitution matrices containing predictor variables specific to missense mutations were constructed. Values of the predictor variables were then collected for the training and test sets, which were derived from the ClinVar and HumsaVar databases. A series of supervised machine learning classifiers was trained, and the performance of each was evaluated on the test set. The best-performing model was then compared against ten currently available functional prediction algorithms. The proposed algorithm, XGBMut, classifies missense mutations with high accuracy and performs competitively against these existing algorithms. A user-friendly graphical interface was also developed to make the tool accessible to professionals in various fields. Unlike most existing methods, XGBMut requires neither a web server nor the installation of third-party software, which makes it a more versatile tool for users.
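The evaluation step described above (scoring several trained classifiers on a held-out test set and keeping the best one) can be illustrated with a minimal sketch. This is not the authors' code: the metric choice (accuracy and the Matthews correlation coefficient), the function names, the model names, and the toy labels are all assumptions made for illustration, and the abstract does not state which metrics were actually used.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels (assumed: 1 = pathogenic, 0 = benign)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of test-set mutations that were classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def best_model(y_true, predictions):
    """Return the name of the candidate whose predictions have the highest MCC."""
    return max(predictions, key=lambda name: mcc(y_true, predictions[name]))


# Toy test-set labels and hypothetical predictions from three candidate classifiers.
y_test = [1, 1, 1, 0, 0, 0, 1, 0]
candidate_predictions = {
    "model_a": [1, 1, 0, 0, 0, 1, 1, 0],
    "model_b": [1, 1, 1, 0, 0, 0, 1, 1],
    "model_c": [0, 1, 0, 1, 0, 0, 1, 0],
}

if __name__ == "__main__":
    for name, y_pred in candidate_predictions.items():
        print(f"{name}: accuracy={accuracy(y_test, y_pred):.3f}, "
              f"MCC={mcc(y_test, y_pred):.3f}")
    print("best by MCC:", best_model(y_test, candidate_predictions))
```

MCC is used here for model selection because it stays informative when benign and pathogenic variants are unevenly represented in the test set. Any other summary metric could be substituted in `best_model`.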