PMID: 29382633

Prediction of Incident Hypertension Within the Next Year: Prospective Study Using Statewide Electronic Health Records and Machine Learning

Abstract

Background: As a high-prevalence health condition, hypertension is clinically costly, difficult to manage, and often leads to severe and life-threatening diseases such as cardiovascular disease (CVD) and stroke.

Objective: The aim of this study was to develop and prospectively validate a risk prediction model for incident essential hypertension within the following year.

Methods: Data from individual patient electronic health records (EHRs) were extracted from the Maine Health Information Exchange network. Retrospective (N=823,627, calendar year 2013) and prospective (N=680,810, calendar year 2014) cohorts were formed. A machine learning algorithm, XGBoost, was used for feature selection and model building. It generated an ensemble of classification trees and assigned a final predictive risk score to each individual.
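To illustrate how an ensemble of trees can produce a single risk score per individual, here is a minimal sketch of gradient boosting with one-split trees (stumps) on a toy dataset. It is a simplified stand-in, not the study's model: XGBoost additionally uses second-order gradients, regularization, and deeper trees, and the features (`age`, `has_diabetes`), the data, and all function names below are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Pick the (feature, threshold) split minimizing squared error on the residuals."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue  # skip splits that leave one side empty
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    _, j, t, lm, rm = best
    return j, t, lm, rm

def predict_stump(stump, row):
    j, t, lm, rm = stump
    return lm if row[j] <= t else rm

def fit_boosted(X, y, n_rounds=50, lr=0.3):
    """Plain gradient boosting on log loss: each stump fits the residuals y - p."""
    base_rate = sum(y) / len(y)
    f0 = math.log(base_rate / (1 - base_rate))  # start from the base-rate log-odds
    scores = [f0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + lr * predict_stump(stump, row) for s, row in zip(scores, X)]
    return f0, lr, stumps

def risk_score(model, row):
    """Sum the contributions of all stumps and map the total to a 0-1 risk score."""
    f0, lr, stumps = model
    return sigmoid(f0 + lr * sum(predict_stump(s, row) for s in stumps))

# Hypothetical toy data: [age, has_diabetes] -> incident hypertension (1) or not (0)
X = [[34, 0], [41, 0], [45, 0], [52, 0], [58, 1], [63, 1], [67, 0], [72, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_boosted(X, y)
high = risk_score(model, [70, 1])
low = risk_score(model, [35, 0])
```

On this toy data the score for the older patient with diabetes ends up above 0.5 and the score for the younger patient below it; on real EHR data the feature set and tuning would look very different.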

Results: The 1-year incident hypertension risk model attained areas under the curve (AUCs) of 0.917 and 0.870 in the retrospective and prospective cohorts, respectively. Risk scores were calculated and stratified into five risk categories: 4526 of 381,544 patients (1.19%) in the lowest risk category (score 0-0.05) and 21,050 of 41,329 patients (50.93%) in the highest risk category (score 0.4-1) received a diagnosis of incident hypertension within the following year. Type 2 diabetes, lipid disorders, CVDs, mental illness, clinical utilization indicators, and socioeconomic determinants were identified as driving or associated features of incident essential hypertension. The very high risk population mainly comprised elderly (age>50 years) individuals with multiple chronic conditions, especially those receiving medications for mental disorders. Disparities were also found in social determinants, including some community-level factors associated with higher risk and others that were protective against hypertension.
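The reported figures can be reproduced or illustrated with a few lines of arithmetic. The sketch below computes an AUC by pairwise comparison, bins scores into five categories, and recomputes the incidence rates quoted above. Only the 0.05 and 0.4 cut points come from the abstract; the intermediate cut points (0.1, 0.2), the category names, the boundary handling, and the toy scores are assumptions made for illustration.

```python
from bisect import bisect_right

def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# 0.05 and 0.4 are from the abstract; 0.1 and 0.2 are hypothetical placeholders.
CUTS = [0.05, 0.1, 0.2, 0.4]
NAMES = ["lowest", "low", "medium", "high", "highest"]

def risk_category(score):
    """Map a 0-1 score to a category; a score on a cut point goes to the upper bin (assumption)."""
    return NAMES[bisect_right(CUTS, score)]

def incidence_pct(cases, patients):
    """Share of patients in a category who were diagnosed, as a percentage."""
    return round(100 * cases / patients, 2)

toy_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])  # toy scores, not study data
lowest_pct = incidence_pct(4526, 381544)            # counts quoted in the Results
highest_pct = incidence_pct(21050, 41329)
```

With the counts from the abstract, `lowest_pct` and `highest_pct` come out to 1.19 and 50.93, matching the reported percentages.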

Conclusions: Using statewide EHR datasets, our study prospectively validated an accurate 1-year risk prediction model for incident essential hypertension. Our real-time predictive analytic model has been deployed in the state of Maine, where it can inform interventions for hypertension and related diseases and may improve hypertension care.

Citing Articles

Retinal vein occlusion risk prediction without fundus examination using a no-code machine learning tool for tabular data: a nationwide cross-sectional study from South Korea.

Yu N, Shin D, Ryu I, Yoo T, Koh K. BMC Med Inform Decis Mak. 2025; 25(1):118.

PMID: 40055729 PMC: 11889835. DOI: 10.1186/s12911-025-02950-8.


Trends and Gaps in Digital Precision Hypertension Management: Scoping Review.

Clifford N, Tunis R, Ariyo A, Yu H, Rhee H, Radhakrishnan K. J Med Internet Res. 2025; 27:e59841.

PMID: 39928934 PMC: 11851032. DOI: 10.2196/59841.


AI Machine Learning-Based Diabetes Prediction in Older Adults in South Korea: Cross-Sectional Analysis.

Lee H, Park M, Won Y. JMIR Form Res. 2025; 9:e57874.

PMID: 39838554 PMC: 11779598. DOI: 10.2196/57874.


Mitigating Data Leakage in a WiFi CSI Benchmark for Human Action Recognition.

Varga D. Sensors (Basel). 2025; 24(24.

PMID: 39771935 PMC: 11679234. DOI: 10.3390/s24248201.


A machine learning tool for early identification of celiac disease autoimmunity.

Dreyfuss M, Getz B, Lebwohl B, Ramni O, Underberger D, Ber T. Sci Rep. 2024; 14(1):30760.

PMID: 39730479 PMC: 11681168. DOI: 10.1038/s41598-024-80817-0.

