Improved Tools for Biological Sequence Comparison

Overview
Specialty Science
Date 1988 Apr 1
PMID 3162770
Citations 3348
Abstract

We have developed three computer programs for comparisons of protein and DNA sequences. They can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity. The FASTA program is a more sensitive derivative of the FASTP program, which can be used to search protein or DNA sequence data bases and can compare a protein sequence to a DNA sequence data base by translating the DNA data base as it is searched. FASTA includes an additional step in the calculation of the initial pairwise similarity score that allows multiple regions of similarity to be joined to increase the score of related sequences. The RDF2 program can be used to evaluate the significance of similarity scores using a shuffling method that preserves local sequence composition. The LFASTA program can display all the regions of local similarity between two sequences with scores greater than a threshold, using the same scoring parameters and a similar alignment algorithm; these local similarities can be displayed as a "graphic matrix" plot or as individual alignments. In addition, these programs have been generalized to allow comparison of DNA or protein sequences based on a variety of alternative scoring matrices.
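The shuffling idea behind RDF2 can be illustrated with a minimal sketch. The abstract states only that the shuffle preserves local sequence composition; everything else here is an assumption made for illustration. In particular, the fixed window size, the match/mismatch/gap values, the Smith-Waterman-style stand-in score, and the names local_shuffle, local_score, and shuffled_z are hypothetical and are not taken from the RDF2 program.

```python
import random
import statistics


def local_shuffle(seq, window=10, rng=random):
    """Shuffle residues inside consecutive fixed-size windows.

    Each window keeps its own residue composition, so composition is
    preserved locally as well as globally. The window size is an assumption.
    """
    out = []
    for start in range(0, len(seq), window):
        block = list(seq[start:start + window])
        rng.shuffle(block)
        out.extend(block)
    return "".join(out)


def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with linear gap costs.

    A simple stand-in for a similarity score; it is not FASTA's scoring.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best


def shuffled_z(query, target, n_shuffles=50, window=10, seed=0):
    """Compare the observed score with scores against locally shuffled targets.

    Returns (observed, mean, sd, z); z is None when the shuffled scores
    have zero spread.
    """
    rng = random.Random(seed)
    observed = local_score(query, target)
    scores = [
        local_score(query, local_shuffle(target, window, rng))
        for _ in range(n_shuffles)
    ]
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    z = (observed - mean) / sd if sd > 0 else None
    return observed, mean, sd, z
```

Calling shuffled_z(query, target) returns the observed score together with the mean and standard deviation of the shuffled scores. A large z suggests the observed similarity is not explained by local composition alone, under the assumptions stated above.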

Citing Articles

LongReadSum: A fast and flexible quality control and signal summarization tool for long-read sequencing data.

Perdomo J, Ahsan M, Liu Q, Fang L, Wang K Comput Struct Biotechnol J. 2025; 27:556-563.

PMID: 39981293 PMC: 11840941. DOI: 10.1016/j.csbj.2025.01.019.


Group B Streptococcus growth in human urine is associated with asymptomatic bacteriuria rather than urinary tract infection and is unaffected by iron sequestration.

Ipe D, Goh K, Desai D, Ben-Zakour N, Sullivan M, Beatson S Microbiology (Reading). 2025; 171(2).

PMID: 39976609 PMC: 11842879. DOI: 10.1099/mic.0.001533.


Pervasive Mitochondrial tRNA Gene Loss in Clade B of Haplosclerid Sponges (Porifera, Demospongiae).

Lavrov D, Turner T, Vicente J Genome Biol Evol. 2025; 17(3).

PMID: 39913674 PMC: 11886574. DOI: 10.1093/gbe/evaf020.


The algebraic extended atom-type graph-based model for precise ligand-receptor binding affinity prediction.

Mukta F, Rana M, Meyer A, Ellingson S, Nguyen D J Cheminform. 2025; 17(1):10.

PMID: 39844277 PMC: 11756177. DOI: 10.1186/s13321-025-00955-z.


Taming large-scale genomic analyses via sparsified genomics.

Alser M, Eudine J, Mutlu O Nat Commun. 2025; 16(1):876.

PMID: 39837860 PMC: 11751491. DOI: 10.1038/s41467-024-55762-1.

