Database Searching Using Mass Spectrometry Data
Large-scale DNA sequencing is creating a sequence infrastructure of great benefit to protein biochemistry. Concurrent with the application of large-scale DNA sequencing to whole-genome analysis, mass spectrometry has attained the capability to determine the molecular weights and amino acid sequences of peptides rapidly and with remarkable sensitivity. Computer algorithms have been developed to use the two different types of data generated by mass spectrometers to search sequence databases. The first method is peptide mass mapping. When a protein is digested with a site-specific protease, the molecular weights of the resulting collection of peptides, known as the mass map or fingerprint, can be determined using mass spectrometry. This set of molecular weights can then be used to identify the protein, and several different approaches to doing so have been developed. Protein identification using peptide mass mapping is an effective technique when studying organisms with completed genomes. The second method is based on data created by tandem mass spectrometers. Tandem mass spectra contain highly specific information in the fragmentation pattern as well as sequence information. This information has been used to search databases of translated protein sequences as well as nucleotide databases such as expressed sequence tag (EST) sequences. The ability to search nucleotide databases is an advantage when analyzing data from organisms whose genomes are not yet completed but for which a large amount of expressed gene sequence is available (e.g., human and mouse). Furthermore, a strength of using tandem mass spectra to search databases is the ability to identify proteins present in fairly complex mixtures.
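The peptide mass mapping idea described above can be illustrated with a minimal sketch. It is not any of the published algorithms the abstract alludes to. It assumes trypsin as the site-specific protease (cleavage after K or R, except before P), no missed cleavages, unmodified residues, neutral monoisotopic masses, and a simple count of matched masses as the score. The function names, the tolerance value, and the toy sequences are all hypothetical and chosen only for illustration.

```python
# Standard monoisotopic residue masses in daltons (rounded to 5 decimals).
MONO_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(sequence):
    """Split a protein after K or R unless the next residue is P.

    Assumption: complete digestion, no missed cleavages.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, sequence, tolerance=0.01):
    """Count observed masses within `tolerance` Da of any theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(obs - theo) <= tolerance for theo in theoretical)
        for obs in observed
    )


def rank_database(observed, database, tolerance=0.01):
    """Return (name, match count) pairs sorted from best to worst match."""
    scores = [
        (name, count_matches(observed, seq, tolerance))
        for name, seq in database.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy database with made-up sequences (hypothetical, not real proteins).
    toy_db = {
        "protA": "MKWVTFISLLLLFSSAYSR",
        "protB": "GASPVTCLINDQKEMHFRYW",
    }
    # Simulate a measured mass map by digesting protA in silico.
    measured = [peptide_mass(p) for p in tryptic_digest(toy_db["protA"])]
    for name, score in rank_database(measured, toy_db):
        print(name, score)
```

A plain match count is the simplest possible score. Real search tools typically also account for missed cleavages, modifications, mass accuracy, and the chance of random matches in a large database, none of which this sketch attempts.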