
Constructing Co-occurrence Network Embeddings to Assist Association Extraction for COVID-19 and Other Coronavirus Infectious Diseases

Overview
Date 2020 May 28
PMID 32458963
Citations 7
Abstract

Objective: As coronavirus disease 2019 (COVID-19) emerged rapidly and developed into an unprecedented pandemic, the need for a knowledge repository for the disease became crucial. To address this need, a new machine-readable COVID-19 dataset, the COVID-19 Open Research Dataset (CORD-19), has been released. Based on this dataset, our objective was to build computable co-occurrence network embeddings to assist association detection among COVID-19-related biomedical entities.

Materials And Methods: Leveraging a Linked Data version of CORD-19 (ie, CORD-19-on-FHIR), we first used SPARQL to extract co-occurrences among chemicals, diseases, genes, and mutations and to build a co-occurrence network. We then trained the representation of the derived co-occurrence network using node2vec with 4 edge embedding operations (L1, L2, Average, and Hadamard). Six algorithms (decision tree, logistic regression, support vector machine, random forest, naïve Bayes, and multilayer perceptron) were applied to evaluate performance on link prediction. An unsupervised learning strategy incorporating the t-SNE (t-distributed stochastic neighbor embedding) and DBSCAN (density-based spatial clustering of applications with noise) algorithms was also developed for case studies.
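
As an illustration of the 4 edge embedding operations named above, the following minimal sketch combines two node vectors into one edge vector. The element-wise definitions follow the binary operators commonly used with node2vec (absolute difference for L1, squared difference for L2); the abstract does not spell them out, so treating them as the authors' exact formulation is an assumption. The function name `edge_embedding` and the 3-dimensional toy vectors are hypothetical.

```python
def edge_embedding(u, v, op):
    """Combine two node vectors into a single edge vector, element by element."""
    if len(u) != len(v):
        raise ValueError("node vectors must have the same dimension")
    if op == "average":
        return [(a + b) / 2 for a, b in zip(u, v)]
    if op == "hadamard":
        return [a * b for a, b in zip(u, v)]
    if op == "l1":
        return [abs(a - b) for a, b in zip(u, v)]
    if op == "l2":
        return [(a - b) ** 2 for a, b in zip(u, v)]
    raise ValueError(f"unknown operator: {op}")


# Hypothetical node vectors; real node2vec embeddings are learned, not hand-written.
chemical = [1.0, -2.0, 0.5]
disease = [3.0, 2.0, 0.5]

for op in ("l1", "l2", "average", "hadamard"):
    print(op, edge_embedding(chemical, disease, op))
```

Each resulting edge vector can then serve as the feature vector for a link prediction classifier, with co-occurring entity pairs as positive examples.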

Results: The random forest classifier showed the best performance on link prediction across different network embeddings. For edge embeddings generated using the Average operation, random forest achieved the best average precision of 0.97 along with an F1 score of 0.90. For unsupervised learning, 63 clusters were formed with a silhouette score of 0.128. Significant associations were detected for 5 coronavirus infectious diseases in their corresponding subgroups.
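
For readers unfamiliar with the two link prediction metrics reported above, the sketch below computes an F1 score and average precision on a small toy example. The labels, scores, and the 0.5 decision threshold are hypothetical and are not taken from the study; average precision is computed here as the mean of the precision values at each true positive in the ranked list, which assumes no tied scores.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = link exists)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def average_precision(y_true, scores):
    """Mean precision at the rank of each true positive, ranking by descending score."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Hypothetical candidate edges: true labels and classifier scores.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("F1:", round(f1_score(y_true, y_pred), 3))
print("Average precision:", round(average_precision(y_true, scores), 3))
```

F1 depends on the chosen threshold, whereas average precision summarizes ranking quality across all thresholds, which is why the two values can differ for the same classifier.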

Conclusions: In this study, we constructed COVID-19-centered co-occurrence network embeddings. Results indicated that the generated embeddings were able to extract significant associations for COVID-19 and coronavirus infectious diseases.

Citing Articles

Uncovering COVID-19 transmission tree: identifying traced and untraced infections in an infection network.

Lee H, Choi H, Lee H, Lee S, Kim C Front Public Health. 2024; 12:1362823.

PMID: 38887240 PMC: 11180726. DOI: 10.3389/fpubh.2024.1362823.


SymptomGraph: Identifying Symptom Clusters from Narrative Clinical Notes using Graph Clustering.

Tahabi F, Storey S, Luo X Proc Symp Appl Comput. 2023; 2023:518-527.

PMID: 37720922 PMC: 10504685. DOI: 10.1145/3555776.3577685.


Review on the Evaluation and Development of Artificial Intelligence for COVID-19 Containment.

Hasan M, Islam M, Sadeq M, Fung W, Uddin J Sensors (Basel). 2023; 23(1).

PMID: 36617124 PMC: 9824505. DOI: 10.3390/s23010527.


Deep Denoising of Raw Biomedical Knowledge Graph From COVID-19 Literature, LitCovid, and Pubtator: Framework Development and Validation.

Jiang C, Ngo V, Chapman R, Yu Y, Liu H, Jiang G J Med Internet Res. 2022; 24(7):e38584.

PMID: 35658098 PMC: 9301549. DOI: 10.2196/38584.


A Web Application for Biomedical Text Mining of Scientific Literature Associated with Coronavirus-Related Syndromes: Coronavirus Finder.

Armenta-Medina D, Brambila-Tapia A, Miranda-Jimenez S, Rodea-Montero E Diagnostics (Basel). 2022; 12(4).

PMID: 35453935 PMC: 9028729. DOI: 10.3390/diagnostics12040887.

