The Signature Molecular Descriptor. 4. Canonizing Molecules Using Extended Valence Sequences
Overview
Medical Informatics
We present a new algorithm to canonize molecular graphs using the signature molecular descriptor introduced in the previous papers of this series. While developed specifically for molecular structures, the algorithm can be used for any graph and is not limited to the classes for which polynomial-time algorithms exist: acyclic graphs, planar graphs, bounded-valence graphs, and bounded-genus graphs. The algorithm is tested with benzenoid hydrocarbons and a database of 126,705 organic compounds. Its performance is compared against Brendan McKay's Nauty algorithm, which is believed to be the fastest canonization algorithm for general graphs, on five series of graphs, each comprising up to 30,000 vertices: 2D meshes (pericondensed benzenoids), 3D cages (fullerenes and nanotubes), 3D meshes (crystal lattices), 4D cages, and power-law graphs (protein and gene networks). The algorithm is available as open-source code at http://www.cs.sandia.gov/~jfaulon/QSAR.
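To make concrete what canonizing a molecular graph means, the sketch below computes a canonical form by brute force: it tries every vertex ordering and keeps the lexicographically smallest encoding. This is not the signature-based algorithm of the paper, nor Nauty; it runs in factorial time and is only usable on a handful of atoms. The function name `canonical_form` and the encoding (a tuple of atom labels followed by a sorted edge list) are assumptions made for illustration.

```python
from itertools import permutations


def canonical_form(num_vertices, edges, labels):
    """Return the smallest (labels, edges) encoding over all vertex orderings.

    Illustrative brute force only: O(n!) orderings are examined.
    """
    best = None
    for perm in permutations(range(num_vertices)):
        # perm[old] gives the new index assigned to vertex `old`.
        new_labels = [None] * num_vertices
        for old, new in enumerate(perm):
            new_labels[new] = labels[old]
        # Relabel each edge, store endpoints in ascending order, then sort.
        new_edges = sorted(
            tuple(sorted((perm[u], perm[v]))) for u, v in edges
        )
        candidate = (tuple(new_labels), tuple(new_edges))
        if best is None or candidate < best:
            best = candidate
    return best


# Heavy-atom skeleton of ethanol (C-C-O) under two different atom numberings.
ethanol_a = canonical_form(3, [(0, 1), (1, 2)], ["C", "C", "O"])
ethanol_b = canonical_form(3, [(2, 0), (0, 1)], ["C", "O", "C"])

# Dimethyl ether (C-O-C): same atoms, different connectivity.
ether = canonical_form(3, [(0, 1), (1, 2)], ["C", "O", "C"])

print(ethanol_a == ethanol_b)  # same molecule, same canonical form
print(ethanol_a == ether)      # different molecules, different forms
```

Two numberings of the same molecule map to one encoding, while a constitutional isomer maps to a different one. A practical canonizer must reach the same guarantee without enumerating all orderings, which is the problem the abstract addresses.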