
Novel Head and Neck Cancer Survival Analysis Approach: Random Survival Forests Versus Cox Proportional Hazards Regression

Overview
Journal Head Neck
Date 2011 Feb 16
PMID 21322080
Citations 29
Abstract

Background: Electronic patient files generate an enormous amount of medical data. These data can be used for research, such as prognostic modeling. Automating statistical prognostication allows models to be updated automatically when new data are gathered. The resulting increase in power makes an automated prognostic model's predictions more reliable. Cox proportional hazards regression is the method most frequently used in prognostication. Automating a Cox model is possible, but we expect the updating process to be time-consuming. A possible solution lies in an alternative modeling technique called random survival forests (RSFs). RSF is easily automated and is known to handle the proportionality assumption coherently and automatically. The performance of RSF has not yet been tested on a large head and neck oncological dataset. This study investigates the performance of RSF models in predicting overall survival in head and neck cancer. Performance is compared with that of a Cox model as the "gold standard." If performance is similar, RSF might be an interesting alternative modeling approach for automation.

Methods: RSF models were created in R (the Cox model also in SPSS). Four RSF splitting rules were used: log-rank, conservation of events, log-rank score, and log-rank approximation. Models were based on historical data from 1371 patients with primary head and neck cancer diagnosed between 1981 and 1998. The models contained 8 covariates: tumor site, T classification, N classification, M classification, age, sex, prior malignancies, and comorbidity. Model performance was measured by Harrell's concordance error rate, with 33% of the original data serving as a validation sample.
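For readers unfamiliar with the performance measure named in Methods, the following is a minimal Python sketch of Harrell's concordance error rate (1 minus the concordance index). It is not the study's code, which was written in R. The function name harrell_error_rate and the toy data are hypothetical, and the sketch assumes that pairs with tied survival times are simply skipped and that a higher risk score means a worse predicted outcome.

```python
from typing import Sequence


def harrell_error_rate(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Return 1 - C, where C is Harrell's concordance index.

    times  : observed follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    risks  : predicted risk score (higher = worse predicted outcome)

    Assumption for this sketch: pairs with tied times are skipped.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        # A pair is usable only if the patient with the shorter
        # follow-up time actually experienced the event.
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # higher risk died earlier
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied predictions count half
    if usable == 0:
        raise ValueError("no usable pairs; cannot compute concordance")
    return 1.0 - concordant / usable


# Toy data (hypothetical): four patients, the second one censored.
toy_times = [5, 8, 12, 20]
toy_events = [1, 0, 1, 1]
toy_risks = [0.9, 0.4, 0.1, 0.6]
toy_error = harrell_error_rate(toy_times, toy_events, toy_risks)
print(toy_error)
```

An error rate of 0 means every usable pair is ranked correctly, 0.5 corresponds to random ranking, and values are read the same way as the error rates reported in Results.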

Results: RSF and Cox models delivered similar error rates. The Cox model performed slightly better (error rate, 0.2826). The log-rank splitting approach gave the best RSF performance (error rate, 0.2873). According to both the Cox and RSF models, high T classification, high N classification, and severe comorbidity are very important covariates, whereas sex, mild comorbidity, and a supraglottic larynx tumor are less important. A discrepancy arose regarding the importance of M1 classification (see Discussion).

Conclusion: Both approaches delivered similar error rates. The Cox model gives a clinically understandable output on covariate impact, whereas RSF is more of a "black box." RSF complements the Cox model by giving more insight into, and confidence in, the relative importance of model covariates. RSF can be recommended as the approach of choice in automating survival analyses.
