
Hybrid Basketball Game Outcome Prediction Model by Integrating Data Mining Methods for the National Basketball Association

Overview
Journal: Entropy (Basel)
Publisher: MDPI
Date: 2021 Apr 30
PMID: 33920720
Citations: 5
Abstract

The sports market has grown rapidly over the last several decades. Sports outcome prediction is an attractive challenge in sports analytics because it provides useful information for operations in the sports market. In this study, a hybrid basketball game outcome prediction scheme is developed for predicting the final scores of National Basketball Association (NBA) games by integrating five data mining techniques: extreme learning machine, multivariate adaptive regression splines, k-nearest neighbors, eXtreme gradient boosting (XGBoost), and stochastic gradient boosting. Designed features are generated by merging information from different game-lags of fundamental basketball statistics and are used in the proposed scheme. This study collected data from all games of the NBA 2018-2019 season. There are 30 teams in the NBA and each team plays 82 games per season, so a total of 2460 (30 × 82) NBA game data points were collected. Empirical results show that the proposed hybrid basketball game prediction scheme achieves high prediction performance and identifies suitable game-lag information and relevant game features (statistics). The findings suggest that a two-stage XGBoost model using four pieces of game-lag information achieves the best prediction performance among all competing models. Six designed features from four game-lags (averaged defensive rebounds, averaged two-point field goal percentage, averaged free throw percentage, averaged offensive rebounds, averaged assists, and averaged three-point field goal attempts) have a greater effect on the prediction of final scores of NBA games than those from other game-lags. The findings of this study provide relevant insights and guidance for research on outcome prediction in other team or individual sports.
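The "designed features" in the abstract can be illustrated with a small sketch. It assumes that a game-lag of n means averaging a team's statistics over its previous n games; the abstract does not spell out the exact construction, so this is an illustration and not the authors' implementation. The function name `lagged_averages`, the record keys (`team`, `score`, `drb`, `ast`), and the sample values are all hypothetical.

```python
from collections import defaultdict, deque

# Record count stated in the abstract: 30 teams x 82 games each.
TEAM_GAME_RECORDS = 30 * 82  # 2460


def lagged_averages(games, stats, lag):
    """Average each statistic over a team's previous `lag` games.

    `games` must be in chronological order. Each game is a dict with a
    "team" key, a "score" key (the prediction target), and one numeric
    value per name in `stats`. A game is skipped until its team has
    `lag` earlier games on record.
    """
    history = defaultdict(lambda: deque(maxlen=lag))
    rows = []
    for game in games:
        past = history[game["team"]]
        if len(past) == lag:
            # Features use only earlier games, never the current one.
            feats = {f"avg_{s}": sum(g[s] for g in past) / lag for s in stats}
            rows.append({"team": game["team"], "score": game["score"], **feats})
        past.append(game)
    return rows


# Hypothetical box-score records for one team (made-up numbers).
sample = [
    {"team": "A", "score": 100, "drb": 30, "ast": 20},
    {"team": "A", "score": 110, "drb": 34, "ast": 24},
    {"team": "A", "score": 105, "drb": 32, "ast": 22},
]
rows = lagged_averages(sample, ["drb", "ast"], lag=2)
```

With `lag=2`, only the third game has enough history, so `rows` holds a single record pairing that game's score with the averages of the two preceding games. Rows of this shape could then be passed to any of the regressors named in the abstract.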

Citing Articles

Development of sequential winning-percentage prediction model for badminton competitions: applying the expert system sequential probability ratio test.

Jo E. BMC Sports Sci Med Rehabil. 2025; 17(1):48.

PMID: 40082947 PMC: 11905516. DOI: 10.1186/s13102-025-01078-6.


Machine learning-driven prediction of medical expenses in triple-vessel PCI patients using feature selection.

Chen K, Huang Y, Liu C, Li S, Chen M. BMC Health Serv Res. 2025; 25(1):105.

PMID: 39833782 PMC: 11744989. DOI: 10.1186/s12913-025-12218-6.


Research on prediction and evaluation algorithm of sports athletes performance based on neural network.

Wang K, Zhu D, Chang Z, Wu Z. Technol Health Care. 2024; 32(6):4869-4882.

PMID: 38848203 PMC: 11612954. DOI: 10.3233/THC-232000.


Optimization of sports effect evaluation technology from random forest algorithm and elastic network algorithm.

Wang C. PLoS One. 2023; 18(10):e0292557.

PMID: 37862380 PMC: 10588863. DOI: 10.1371/journal.pone.0292557.


Enhancing Basketball Game Outcome Prediction through Fused Graph Convolutional Networks and Random Forest Algorithm.

Zhao K, Du C, Tan G. Entropy (Basel). 2023; 25(5).

PMID: 37238520 PMC: 10217531. DOI: 10.3390/e25050765.

