Finding Complex Biological Relationships in Recent PubMed Articles Using Bio-LDA

Overview
Journal: PLoS One
Date: 2011 Mar 31
PMID: 21448266
Citations: 33
Abstract

The overwhelming amount of scholarly literature available in the life sciences makes it hard for scientists to keep up with important developments related to their research, but it also provides a useful resource for discovering recent information about genes, diseases, compounds and the interactions between them. In this paper, we describe an algorithm called Bio-LDA that uses extracted biological terminology to automatically identify latent topics, and provides a variety of measures to uncover putative relations among topics and bio-terms. Relationships identified using these measures are combined with existing data in life science datasets to provide additional insight. Three case studies demonstrate the utility of the Bio-LDA model: association prediction, association search and connectivity map generation. This combined approach offers new opportunities for knowledge discovery in many areas of biology, including target identification, lead hopping and drug repurposing.
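The abstract does not spell out the relation measures, so the following is only a minimal sketch of one plausible approach, not the paper's method: once a topic model has assigned each bio-term a distribution over latent topics, two terms can be compared by how far apart those distributions are. The term names, the three-topic distributions, and the choice of symmetric KL divergence are all assumptions made for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    eps guards against log(0); it is an illustrative smoothing choice.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p); smaller means more similar."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

# Hypothetical per-term topic distributions (each row sums to 1).
# In practice these would come from a fitted topic model.
term_topics = {
    "gene_A": [0.70, 0.20, 0.10],
    "compound_B": [0.65, 0.25, 0.10],
    "disease_C": [0.05, 0.15, 0.80],
}

def rank_associations(query, term_topics):
    """Rank all other terms by topic-distribution distance to the query."""
    scores = [
        (term, symmetric_kl(term_topics[query], dist))
        for term, dist in term_topics.items()
        if term != query
    ]
    return sorted(scores, key=lambda pair: pair[1])

ranked = rank_associations("gene_A", term_topics)
for term, score in ranked:
    print(f"{term}: {score:.4f}")
```

Terms whose topic distributions nearly coincide rank first, which is one simple way to read "association search" over latent topics; a real system would combine such scores with curated data, as the abstract describes.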

Citing Articles

Mapping the Bibliometrics Landscape of AI in Medicine: Methodological Study.

Shi J, Bendig D, Vollmar H, Rasche P. J Med Internet Res. 2023; 25:e45815.

PMID: 38064255 PMC: 10746970. DOI: 10.2196/45815.


RegenX: an NLP recommendation engine for neuroregeneration topics over time.

Khosla S, Abdelrahman L, Johnson J, Samarah M, Bhattacharya S. Ann Eye Sci. 2022; 7.

PMID: 36199680 PMC: 9531894. DOI: 10.21037/aes-21-29.


Large-Scale Validation of Hypothesis Generation Systems via Candidate Ranking.

Sybrandt J, Shtutman M, Safro I. Proc IEEE Int Conf Big Data. 2022; 2018:1494-1503.

PMID: 35789222 PMC: 9248026. DOI: 10.1109/bigdata.2018.8622637.


Social Media Insights Into US Mental Health During the COVID-19 Pandemic: Longitudinal Analysis of Twitter Data.

Valdez D, Ten Thij M, Bathina K, Rutter L, Bollen J. J Med Internet Res. 2020; 22(12):e21418.

PMID: 33284783 PMC: 7744146. DOI: 10.2196/21418.


Methodologically grounded semantic analysis of large volume of Chilean medical literature data applied to the analysis of medical research funding efficiency in Chile.

Wolff P, Rios S, Clavijo D, Grana M, Carrasco M. J Biomed Semantics. 2020; 11(1):12.

PMID: 32993795 PMC: 7523397. DOI: 10.1186/s13326-020-00226-w.

