PMID: 34845212

Ensuring Scientific Reproducibility in Bio-macromolecular Modeling Via Extensive, Automated Benchmarks

Abstract

Each year vast international resources are wasted on irreproducible research. The scientific community has been slow to adopt standard software engineering practices, despite the increases in high-dimensional data, complexities of workflows, and computational environments. Here we show how scientific software applications can be created in a reproducible manner when simple design goals for reproducibility are met. We describe the implementation of a test server framework and 40 scientific benchmarks, covering numerous applications in Rosetta bio-macromolecular modeling. High performance computing cluster integration allows these benchmarks to run continuously and automatically. Detailed protocol captures are useful for developers and users of Rosetta and other macromolecular modeling tools. The framework and design concepts presented here are valuable for developers and users of any type of scientific software and for the scientific community to create reproducible methods. Specific examples highlight the utility of this framework, and the comprehensive documentation illustrates the ease of adding new tests in a matter of hours.

Citing Articles

Growing Glycans in Rosetta: Accurate de novo glycan modeling, density fitting, and rational sequon design.

Adolf-Bryfogle J, Labonte J, Kraft J, Shapovalov M, Raemisch S, Lutteke T. PLoS Comput Biol. 2024; 20(6):e1011895.

PMID: 38913746 PMC: 11288642. DOI: 10.1371/journal.pcbi.1011895.


Combining machine learning with structure-based protein design to predict and engineer post-translational modifications of proteins.

Ertelt M, Mulligan V, Maguire J, Lyskov S, Moretti R, Schiffner T. PLoS Comput Biol. 2024; 20(3):e1011939.

PMID: 38484014 PMC: 10965067. DOI: 10.1371/journal.pcbi.1011939.


Modeling membrane geometries implicitly in Rosetta.

Woods H, Leman J, Meiler J. Protein Sci. 2024; 33(3):e4908.

PMID: 38358133 PMC: 10868433. DOI: 10.1002/pro.4908.


Implicit model to capture electrostatic features of membrane environment.

Samanta R, Gray J. PLoS Comput Biol. 2024; 20(1):e1011296.

PMID: 38252688 PMC: 10833867. DOI: 10.1371/journal.pcbi.1011296.


NMR and Docking Calculations Reveal Novel Atomistic Selectivity of a Synthetic High-Affinity Free Fatty Acid vs. Free Fatty Acids in Sudlow's Drug Binding Sites in Human Serum Albumin.

Venianakis T, Primikyri A, Opatz T, Petry S, Papamokos G, Gerothanassis I. Molecules. 2023; 28(24).

PMID: 38138481 PMC: 10745614. DOI: 10.3390/molecules28247991.

