
Large-scale Automated Machine Reading Discovers New Cancer-driving Mechanisms

Overview
Specialty: Biology
Date: 2018 Sep 27
PMID: 30256986
Citations: 20
Abstract

PubMed, a repository and search engine for biomedical literature, now indexes >1 million articles each year. This exceeds the processing capacity of human domain experts, limiting our ability to truly understand many diseases. We present Reach, a system for automated, large-scale machine reading of biomedical papers that can extract mechanistic descriptions of biological processes with relatively high precision at high throughput. We demonstrate that combining the extracted pathway fragments with existing biological data analysis algorithms that rely on curated models helps identify and explain a large number of previously unidentified mutually exclusive altered signaling pathways in seven different cancer types. This work shows that combining human-curated 'big mechanisms' with extracted 'big data' can lead to a causal, predictive understanding of cellular processes and unlock important downstream applications.

Citing Articles

A Computational Protocol for the Knowledge-Based Assessment and Capture of Pathologies.

Page J, Moore N, Broderick G. Methods Mol Biol. 2024; 2868:265-284.

PMID: 39546235 DOI: 10.1007/978-1-0716-4200-9_14.


Beyond protein lists: AI-assisted interpretation of proteomic investigations in the context of evolving scientific knowledge.

Gyori B, Vitek O. Nat Methods. 2024; 21(8):1387-1389.

PMID: 39122950 DOI: 10.1038/s41592-024-02324-4.


Semantics-enabled biomedical literature analytics.

Kilicoglu H, Ensan F, McInnes B, Wang L. J Biomed Inform. 2024; 150:104588.

PMID: 38244957 PMC: 11771130 DOI: 10.1016/j.jbi.2024.104588.


Leveraging Structured Biological Knowledge for Counterfactual Inference: A Case Study of Viral Pathogenesis.

Zucker J, Paneri K, Mohammad-Taheri S, Bhargava S, Kolambkar P, Bakker C. IEEE Trans Big Data. 2023; 7(1):25-37.

PMID: 37981991 PMC: 8769018 DOI: 10.1109/TBDATA.2021.3050680.


Automated assembly of molecular mechanisms at scale from text mining and curated databases.

Bachman J, Gyori B, Sorger P. Mol Syst Biol. 2023; 19(5):e11325.

PMID: 36938926 PMC: 10167483 DOI: 10.15252/msb.202211325.

