PDBench: Evaluating Computational Methods for Protein-sequence Design
Summary: Ever-increasing amounts of protein structure data, combined with advances in machine learning, have led to a rapid proliferation of methods for protein-sequence design. To use a design method effectively, it is important to understand the nuances of its performance and how it varies by design target. Here, we present PDBench, a set of proteins and a number of standard tests for assessing the performance of sequence-design methods. Compared with previous benchmarking sets, PDBench aims to maximize the structural diversity of the benchmark in order to provide useful biological insight into the behaviour of sequence-design methods, which is essential for evaluating their performance and practical utility. We believe that these tools are useful for guiding the development of novel sequence-design algorithms and will enable users to choose the method that best suits their design target.
Availability and implementation: https://github.com/wells-wood-research/PDBench.
Supplementary information: Supplementary data are available at Bioinformatics online.