
Evaluation of Human-readable Annotation in Biomolecular Sequence Databases with Biological Rule Libraries

Overview
Journal Bioinformatics
Specialty Biology
Date 1999 Sep 17
PMID 10487860
Citations 11
Abstract

Motivation: Computer-based selection of entries from sequence databases according to a shared functional description, e.g. a common cellular localization or a contribution to the same phenotypic function, is a difficult task. Automatic semantic analysis of annotations is hampered not only by incomplete functional assignments. A major problem is that annotations are written in a rich, non-formalized language and are meant to be read by a human expert, who can extract considerably more information from the text than is immediately apparent by drawing on extensive biological background knowledge and logical reasoning.

Approach: A technique for automated annotation evaluation has been developed that combines lexical analysis with biological rule libraries. The proposed algorithm generates new functional descriptors from the annotation of a given entry, using the semantic units of the annotation as premises for implications executed in accordance with the rule library.
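The approach can be illustrated with a minimal sketch, assuming that lexical analysis reduces to phrase matching against a small lexicon and that the rule library is a list of "premises imply descriptor" pairs applied by forward chaining. The lexicon, the rules, and the function names `extract_units` and `infer_descriptors` are hypothetical and are not taken from the Meta_A(nnotator) program.

```python
import re

# Hypothetical lexicon: phrases treated as semantic units of an annotation.
LEXICON = ["signal peptide", "histone", "hormone", "nuclear", "secreted"]

# Hypothetical rule library: (set of premises, implied descriptor).
# These rules are illustrative only, not the published rule library.
RULES = [
    (frozenset({"histone"}), "nuclear"),
    (frozenset({"nuclear"}), "intracellular"),
    (frozenset({"signal peptide", "hormone"}), "secreted"),
    (frozenset({"secreted"}), "extracellular"),
]


def extract_units(annotation):
    """Lexical step: return the lexicon phrases found in the annotation text."""
    text = annotation.lower()
    return {
        term
        for term in LEXICON
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    }


def infer_descriptors(units, rules=RULES):
    """Rule step: apply implications until no new descriptor can be derived.

    Returns only the descriptors that were not already in the annotation.
    """
    known = set(units)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all of its premises are already known.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known - set(units)


units = extract_units("Histone H2A; core component of the nucleosome.")
new_descriptors = infer_descriptors(units)
```

Here the annotation mentions only "histone", yet chaining two rules yields the descriptors "nuclear" and "intracellular", which is the kind of implicit information a human reader would supply from background knowledge.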

Results: The prototype of a software system, the Meta_A(nnotator) program, is described and the results of its application to sequence attribute assignment and sequence selection problems, such as cellular localization and sequence domain annotation of SWISS-PROT entries, are presented. The current software version assigns useful subcellular localization qualifiers to approximately 88% of all SWISS-PROT entries. As shown by demonstrative examples, the combination of sequence and annotation analysis is a powerful approach for the detection of mutual annotation/sequence inconsistencies.

Availability: Results for the cellular localization assignment can be viewed at the URL http://www.bork.embl-heidelberg.de/CELL_LOC/CELL_LOC.html.

Citing Articles

Did the early full genome sequencing of yeast boost gene function discovery?

Tantoso E, Eisenhaber B, Sinha S, Jensen L, Eisenhaber F Biol Direct. 2023; 18(1):46.

PMID: 37574542 PMC: 10424406. DOI: 10.1186/s13062-023-00403-8.


About the dark corners in the gene function space of Escherichia coli remaining without illumination by scientific literature.

Tantoso E, Eisenhaber B, Sinha S, Jensen L, Eisenhaber F Biol Direct. 2023; 18(1):7.

PMID: 36855185 PMC: 9976479. DOI: 10.1186/s13062-023-00362-0.


Darkness in the Human Gene and Protein Function Space: Widely Modest or Absent Illumination by the Life Science Literature and the Trend for Fewer Protein Function Discoveries Since 2000.

Sinha S, Eisenhaber B, Jensen L, Kalbuaji B, Eisenhaber F Proteomics. 2018; 18(21-22):e1800093.

PMID: 30265449 PMC: 6282819. DOI: 10.1002/pmic.201800093.


Can inferred provenance and its visualisation be used to detect erroneous annotation? A case study using UniProtKB.

Bell M, Collison M, Lord P PLoS One. 2013; 8(10):e75541.

PMID: 24143170 PMC: 3797126. DOI: 10.1371/journal.pone.0075541.


Amplification of the Gene Ontology annotation of Affymetrix probe sets.

Muro E, Perez-Iratxeta C, Andrade-Navarro M BMC Bioinformatics. 2006; 7:159.

PMID: 16549014 PMC: 1435773. DOI: 10.1186/1471-2105-7-159.