
Combining PubMed Knowledge and EHR Data to Develop a Weighted Bayesian Network for Pancreatic Cancer Prediction

Overview
Journal: J Biomed Inform
Publisher: Elsevier
Date: 2011 Jun 7
PMID: 21642013
Citations: 42
Abstract

In this paper, we propose a novel method that combines PubMed knowledge and Electronic Health Records to develop a weighted Bayesian Network Inference (BNI) model for pancreatic cancer prediction. We selected 20 common risk factors associated with pancreatic cancer and used PubMed knowledge to weigh the risk factors. A keyword-based algorithm was developed to extract and classify PubMed abstracts into three categories that represented positive, negative, or neutral associations between each risk factor and pancreatic cancer. Then we designed a weighted BNI model by adding the normalized weights into a conventional BNI model. We used this model to extract the EHR values for patients with or without pancreatic cancer, which then enabled us to calculate the prior probabilities for the 20 risk factors in the BNI. The software iDiagnosis was designed to use this weighted BNI model for predicting pancreatic cancer. In an evaluation using a case-control dataset, the weighted BNI model significantly outperformed the conventional BNI and two other classifiers (k-Nearest Neighbor and Support Vector Machine). We conclude that the weighted BNI using PubMed knowledge and EHR data shows remarkable accuracy improvement over existing representative methods for pancreatic cancer prediction.
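
The abstract does not give the keyword lists, the weighting formula, or the network structure, so the following is only a minimal sketch of the general idea under stated assumptions. The cue phrases, the "share of positive abstracts" score, and the weighted naive-Bayes form (a Bayesian network in which every risk factor depends only on the disease node) are all hypothetical stand-ins, not the authors' method or the iDiagnosis implementation.

```python
import math
from collections import Counter

# Hypothetical cue phrases; the paper's actual keyword lists are not in the abstract.
POSITIVE_CUES = ("increased risk", "risk factor for", "associated with")
NEGATIVE_CUES = ("no association", "not associated", "reduced risk")


def classify_abstract(text):
    """Label an abstract as 'positive', 'negative', or 'neutral'."""
    lowered = text.lower()
    # Negative cues are checked first because "not associated with"
    # also contains the positive cue "associated with".
    if any(cue in lowered for cue in NEGATIVE_CUES):
        return "negative"
    if any(cue in lowered for cue in POSITIVE_CUES):
        return "positive"
    return "neutral"


def factor_weights(abstracts_by_factor):
    """Score each factor by its share of positive abstracts (an assumed
    scoring rule), then normalize the scores so they sum to 1."""
    raw = {}
    for factor, abstracts in abstracts_by_factor.items():
        counts = Counter(classify_abstract(a) for a in abstracts)
        total = sum(counts.values())
        raw[factor] = counts["positive"] / total if total else 0.0
    norm = sum(raw.values())
    if norm == 0:
        # No positive evidence anywhere: fall back to uniform weights.
        return {factor: 1.0 / len(raw) for factor in raw}
    return {factor: score / norm for factor, score in raw.items()}


def weighted_posterior(prior, likelihoods, weights, observed):
    """Posterior P(cancer | observed factors) under a weighted naive-Bayes
    assumption: each factor's log-likelihood ratio is scaled by its weight.

    likelihoods[factor] = (P(present | cancer), P(present | no cancer)),
    which in practice would be estimated from EHR case and control counts.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for factor, present in observed.items():
        p_case, p_control = likelihoods[factor]
        if not present:
            p_case, p_control = 1.0 - p_case, 1.0 - p_control
        log_odds += weights[factor] * math.log(p_case / p_control)
    return 1.0 / (1.0 + math.exp(-log_odds))
```

With every weight set to 1 this reduces to ordinary naive Bayes, and a weight of 0 makes a factor uninformative, which is one simple way to let literature-derived evidence strengthen or mute individual risk factors.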

Citing Articles

Expert Judgment Supporting a Bayesian Network to Model the Survival of Pancreatic Cancer Patients.

Secchettin E, Paiella S, Azzolina D, Casciani F, Salvia R, Malleo G. Cancers (Basel). 2025; 17(2).

PMID: 39858083 PMC: 11764457. DOI: 10.3390/cancers17020301.


Predictive risk models for COVID-19 patients using the multi-thresholding meta-algorithm.

Delgado R, Fernandez-Pelaez F, Pallares N, Diaz-Brito V, Izquierdo E, Oriol I. Sci Rep. 2024; 14(1):28453.

PMID: 39557887 PMC: 11574063. DOI: 10.1038/s41598-024-77386-7.


In-hospital mortality, readmission, and prolonged length of stay risk prediction leveraging historical electronic patient records.

Bopche R, Gustad L, Afset J, Ehrnstrom B, Damas J, Nytro O. JAMIA Open. 2024; 7(3):ooae074.

PMID: 39282081 PMC: 11401612. DOI: 10.1093/jamiaopen/ooae074.


Prediction of Cancer Symptom Trajectory Using Longitudinal Electronic Health Record Data and Long Short-Term Memory Neural Network.

Chae S, Street W, Ramaraju N, Gilbertson-White S. JCO Clin Cancer Inform. 2024; 8:e2300039.

PMID: 38471054 PMC: 10948138. DOI: 10.1200/CCI.23.00039.


Determining the feasibility of calculating pancreatic cancer risk scores for people with new-onset diabetes in primary care (DEFEND PRIME): study protocol.

Claridge H, Price C, Ali R, Cooke E, de Lusignan S, Harvey-Sullivan A. BMJ Open. 2024; 14(1):e079863.

PMID: 38262635 PMC: 10806670. DOI: 10.1136/bmjopen-2023-079863.

