
Biomedical Text Summarization to Support Genetic Database Curation: Using Semantic MEDLINE to Create a Secondary Database of Genetic Information

Overview
Date 2010 Oct 12
PMID 20936065
Citations 9
Abstract

Objective: This paper examines the development and evaluation of an automatic summarization system in the domain of molecular genetics. The system is a potential component of an advanced biomedical information management application called Semantic MEDLINE and could assist librarians in developing secondary databases of genetic information extracted from the primary literature.

Methods: An existing summarization system was modified for identifying biomedical text relevant to the genetic etiology of disease. The summarization system was evaluated on the task of identifying data describing genes associated with bladder cancer in MEDLINE citations. A gold standard was produced using records from Genetics Home Reference and Online Mendelian Inheritance in Man. Genes in text found by the system were compared to the gold standard. Recall, precision, and F-measure were calculated.
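The comparison described in the Methods can be sketched as a set-based evaluation. This is a minimal illustration, not the authors' implementation: the function name `evaluate_genes`, the case-insensitive matching of gene symbols, and the placeholder symbols are all assumptions made for the example, and the F-measure is assumed to be the balanced F1 score.

```python
def evaluate_genes(system_genes, gold_genes):
    """Compare genes found by a system against a gold standard.

    Returns (precision, recall, f_measure). Symbols are matched
    case-insensitively, which is an assumption of this sketch.
    """
    system = {g.strip().upper() for g in system_genes}
    gold = {g.strip().upper() for g in gold_genes}

    true_positives = len(system & gold)

    # Precision: fraction of system genes that appear in the gold standard.
    precision = true_positives / len(system) if system else 0.0
    # Recall: fraction of gold-standard genes that the system found.
    recall = true_positives / len(gold) if gold else 0.0

    # Balanced F-measure (harmonic mean of precision and recall).
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Placeholder symbols only; these are not data from the study.
system_output = ["GENE_A", "GENE_B", "GENE_C", "GENE_X"]
gold_standard = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]

p, r, f = evaluate_genes(system_output, gold_standard)
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

In this toy case three of the four system genes are in the gold standard and three of the five gold-standard genes are found, so precision exceeds recall, the same direction as the reported result.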

Results: When Gene Reference Into Function (GeneRIF) annotations were taken into account, the system achieved a recall of 46% and a precision of 88% (F-measure = 0.61).
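As a rough consistency check, assuming the F-measure is the balanced F1 score (the abstract does not state the weighting), the reported precision $P$ and recall $R$ combine as:

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.88 \times 0.46}{0.88 + 0.46}
    = \frac{0.8096}{1.34}
    \approx 0.60
```

The rounded percentages give roughly 0.60; the reported 0.61 is consistent with the figure having been computed from unrounded precision and recall values.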

Conclusion: The new summarization schema for genetic etiology has potential as a component in Semantic MEDLINE to support the work of data curators.

Citing Articles

A semantic relationship mining method among disorders, genes, and drugs from different biomedical datasets.

Zhang L, Hu J, Xu Q, Li F, Rao G, Tao C. BMC Med Inform Decis Mak. 2020; 20(Suppl 4):283.

PMID: 33317518 PMC: 7734713. DOI: 10.1186/s12911-020-01274-z.


Figure-associated text summarization and evaluation.

Polepalli Ramesh B, Sethi R, Yu H. PLoS One. 2015; 10(2):e0115671.

PMID: 25643357 PMC: 4313946. DOI: 10.1371/journal.pone.0115671.


A Framework of Knowledge Integration and Discovery for Supporting Pharmacogenomics Target Predication of Adverse Drug Events: A Case Study of Drug-Induced Long QT Syndrome.

Jiang G, Wang C, Zhu Q, Chute C. AMIA Jt Summits Transl Sci Proc. 2013; 2013:88-92.

PMID: 24303306 PMC: 3814489.


Using SemRep to label semantic relations extracted from clinical text.

Liu Y, Bill R, Fiszman M, Rindflesch T, Pedersen T, Melton G. AMIA Annu Symp Proc. 2013; 2012:587-95.

PMID: 23304331 PMC: 3540517.


Text summarization as a decision support aid.

Workman T, Fiszman M, Hurdle J. BMC Med Inform Decis Mak. 2012; 12:41.

PMID: 22621674 PMC: 3461485. DOI: 10.1186/1472-6947-12-41.

