
Random Forests for Verbal Autopsy Analysis: Multisite Validation Study Using Clinical Diagnostic Gold Standards

Overview
Publisher BioMed Central
Specialty Public Health
Date 2011 Aug 6
PMID 21816105
Citations 52
Abstract

Background: Computer-coded verbal autopsy (CCVA) is a promising alternative to the standard approach of physician-certified verbal autopsy (PCVA), because of its high speed, low cost, and reliability. This study introduces a new CCVA technique and validates its performance using defined clinical diagnostic criteria as a gold standard for a multisite sample of 12,542 verbal autopsies (VAs).

Methods: The Random Forest (RF) Method from machine learning (ML) was adapted to predict cause of death by training random forests to distinguish between each pair of causes and then combining the results through a novel ranking technique. We assessed the quality of the new method at the individual level using chance-corrected concordance and at the population level using cause-specific mortality fraction (CSMF) accuracy as well as linear regression. We also compared the quality of RF to that of PCVA on all of these metrics. We performed this analysis separately for adult, child, and neonatal VAs. We also assessed the variation in performance with and without household recall of health care experience (HCE).
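The pairwise training and rank-based combination described in the Methods can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the ranking technique, so the ranking rule here (ranking each cause's score against the scores of the training records) is an assumption, and the "forest" is a deliberately tiny stand-in built from one-split trees on binary symptom indicators. All function names, cause labels, and data below are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def train_stump_forest(X, y, n_trees, rng):
    """Tiny stand-in for a random forest: each tree is a single split on one
    randomly chosen binary feature, fitted to a bootstrap sample."""
    n, n_feat = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        f = rng.randrange(n_feat)                    # random feature
        leaves = {0: Counter(), 1: Counter()}
        for i in idx:
            leaves[X[i][f]][y[i]] += 1
        overall = Counter(y[i] for i in idx).most_common(1)[0][0]
        # An empty leaf falls back to the majority cause of the bootstrap sample.
        pred = {v: (c.most_common(1)[0][0] if c else overall)
                for v, c in leaves.items()}
        forest.append((f, pred))
    return forest


def vote_fractions(forest, x):
    """Share of trees voting for each cause of the pair."""
    votes = Counter(pred[x[f]] for f, pred in forest)
    return {c: k / len(forest) for c, k in votes.items()}


def fit_pairwise(X, y, n_trees=50, seed=0):
    """Train one forest per pair of causes, using only records of that pair."""
    rng = random.Random(seed)
    causes = sorted(set(y))
    models = {}
    for a, b in combinations(causes, 2):
        rows = [i for i, c in enumerate(y) if c in (a, b)]
        models[(a, b)] = train_stump_forest(
            [X[i] for i in rows], [y[i] for i in rows], n_trees, rng)
    return causes, models


def cause_scores(causes, models, x):
    """Sum each cause's vote fractions over all pairwise forests."""
    scores = dict.fromkeys(causes, 0.0)
    for forest in models.values():
        for c, frac in vote_fractions(forest, x).items():
            scores[c] += frac
    return scores


def predict(causes, models, train_scores, x):
    """Assumed ranking rule: for each cause, count training records that score
    higher on that cause; predict the cause with the best (lowest) rank."""
    s = cause_scores(causes, models, x)

    def rank(c):
        return sum(1 for t in train_scores if t[c] > s[c])

    return min(causes, key=lambda c: (rank(c), -s[c]))


# Hypothetical toy data: three causes, each marked by one symptom indicator.
X = [[1, 0, 0]] * 5 + [[0, 1, 0]] * 5 + [[0, 0, 1]] * 5
y = ["cause_a"] * 5 + ["cause_b"] * 5 + ["cause_c"] * 5
causes, models = fit_pairwise(X, y)
train_scores = [cause_scores(causes, models, x) for x in X]
print(predict(causes, models, train_scores, [0, 1, 0]))
```

Ranking scores within each cause, rather than comparing raw scores across causes, is one way to keep causes with many training records from dominating the prediction. Whether this matches the paper's technique in detail cannot be determined from the abstract.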

Results: For all metrics, for all settings, RF was as good as or better than PCVA, with the exception of a nonsignificantly lower CSMF accuracy for neonates with HCE information. With HCE, the chance-corrected concordance of RF was 3.4 percentage points higher for adults, 3.2 percentage points higher for children, and 1.6 percentage points higher for neonates. The CSMF accuracy was 0.097 higher for adults, 0.097 higher for children, and 0.007 lower for neonates. Without HCE, the chance-corrected concordance of RF was 8.1 percentage points higher than PCVA for adults, 10.2 percentage points higher for children, and 5.9 percentage points higher for neonates. The CSMF accuracy was higher for RF by 0.102 for adults, 0.131 for children, and 0.025 for neonates.
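For readers unfamiliar with the two headline metrics, the sketch below shows how they are commonly computed. It assumes the usual definitions: chance-corrected concordance for cause j is (sensitivity_j - 1/N) / (1 - 1/N) for N causes, averaged over causes, and CSMF accuracy is 1 - sum_j |true_j - predicted_j| / (2 * (1 - min_j true_j)). The abstract itself does not state these formulas, and the function names and example data are hypothetical. The sketch also takes the cause list from the causes present in the true labels, which is a simplification.

```python
from collections import Counter


def chance_corrected_concordance(true, pred):
    """Mean over causes of (sensitivity - 1/N) / (1 - 1/N), N = number of causes."""
    causes = sorted(set(true))
    n_causes = len(causes)
    per_cause = {}
    for c in causes:
        total = sum(1 for t in true if t == c)
        hits = sum(1 for t, p in zip(true, pred) if t == c and p == c)
        sensitivity = hits / total
        per_cause[c] = (sensitivity - 1 / n_causes) / (1 - 1 / n_causes)
    return sum(per_cause.values()) / n_causes, per_cause


def csmf_accuracy(true, pred):
    """1 minus the total absolute CSMF error, scaled by the worst possible error."""
    n = len(true)
    true_counts, pred_counts = Counter(true), Counter(pred)
    causes = set(true) | set(pred)
    abs_error = sum(abs(true_counts[c] / n - pred_counts[c] / n) for c in causes)
    worst_error = 2 * (1 - min(k / n for k in true_counts.values()))
    return 1 - abs_error / worst_error


# Hypothetical example: 10 deaths, 3 causes, 2 misclassified.
true = ["a"] * 4 + ["b"] * 4 + ["c"] * 2
pred = ["a", "a", "a", "b"] + ["b"] * 4 + ["c", "a"]
ccc, ccc_by_cause = chance_corrected_concordance(true, pred)
acc = csmf_accuracy(true, pred)
print(round(ccc, 3), round(acc, 3))
```

Chance-corrected concordance is an individual-level measure (is each death assigned correctly, beyond what random guessing would achieve), while CSMF accuracy is a population-level measure (are the cause fractions right, even if individual assignments are not). That is why the abstract reports the former in percentage points and the latter on a 0 to 1 scale.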

Conclusions: We found that our RF Method outperformed the PCVA method in terms of chance-corrected concordance and CSMF accuracy for adult and child VA with and without HCE and for neonatal VA without HCE. It is also preferable to PCVA in terms of time and cost. Therefore, we recommend it as the technique of choice for analyzing past and current verbal autopsies.

Citing Articles

Private sector delivery of care for maternal and newborn health: trends over a decade in the Indian state of Bihar.
Kumar G, George S, Majumder M, Dora S, Akbar M, Mahapatra T. BMC Med. 2025; 23(1):50.
PMID: 39875877. PMC: 11776212. DOI: 10.1186/s12916-025-03894-6.

Bayesian nested latent class models for cause-of-death assignment using verbal autopsies across multiple domains.
Li Z, Wu Z, Chen I, Clark S. Ann Appl Stat. 2024; 18(2):1137-1159.
PMID: 39421458. PMC: 11484295. DOI: 10.1214/23-aoas1826.

The openVA Toolkit for Verbal Autopsies.
Li Z, Thomas J, Choi E, McCormick T, Clark S. R J. 2023; 14(4):316-334.
PMID: 37974934. PMC: 10653343. DOI: 10.32614/rj-2023-020.

An Artificial Intelligence Model for Predicting Trauma Mortality Among Emergency Department Patients in South Korea: Retrospective Cohort Study.
Lee S, Kang W, Kim D, Seo S, Kim J, Jeong S. J Med Internet Res. 2023; 25:e49283.
PMID: 37642984. PMC: 10498319. DOI: 10.2196/49283.

Pediatric Injury Surveillance From Uncoded Emergency Department Admission Records in Italy: Machine Learning-Based Text-Mining Approach.
Azzolina D, Bressan S, Lorenzoni G, Baldan G, Bartolotta P, Scognamiglio F. JMIR Public Health Surveill. 2023; 9:e44467.
PMID: 37436799. PMC: 10372563. DOI: 10.2196/44467.

