Data-driven Model Discovery and Model Selection for Noisy Biological Systems

Overview
Date 2025 Jan 21
PMID 39836686
Abstract

Biological systems exhibit complex dynamics that differential equations can often adeptly represent. Ordinary differential equation models are widespread; until recently their construction has required extensive prior knowledge of the system. Machine learning methods offer alternative means of model construction: differential equation models can be learnt from data via model discovery using sparse identification of nonlinear dynamics (SINDy). However, SINDy struggles with realistic levels of biological noise and is limited in its ability to incorporate prior knowledge of the system. We propose a data-driven framework for model discovery and model selection using hybrid dynamical systems: partial models containing missing terms. Neural networks are used to approximate the unknown dynamics of a system, enabling the denoising of the data while simultaneously learning the latent dynamics. Simulations from the fitted neural network are then used to infer models using sparse regression. We show, via model selection, that model discovery using hybrid dynamical systems outperforms alternative approaches. We find it possible to infer models correctly up to high levels of biological noise of different types. We demonstrate the potential to learn models from sparse, noisy data in application to a canonical cell state transition using data derived from single-cell transcriptomics. Overall, this approach provides a practical framework for model discovery in biology in cases where data are noisy and sparse, of particular utility when the underlying biological mechanisms are partially but incompletely known.
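The sparse-regression step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses sequentially thresholded least squares, the regression commonly associated with SINDy, on a hypothetical two-variable Lotka-Volterra system. The states `X` and derivatives `dX` are synthetic stand-ins for the smooth simulations that a fitted neural network would provide, and the function names, candidate library, threshold, and parameter values are all assumptions chosen for illustration.

```python
import numpy as np

TERMS = ["1", "x", "y", "x^2", "x*y", "y^2"]

def library(X):
    """Candidate terms up to degree 2 for a two-variable system (assumed library)."""
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for k in range(dxdt.shape[1]):
            keep = ~small[:, k]
            if keep.any():
                # Refit each equation using only the terms that survived.
                xi[keep, k] = np.linalg.lstsq(
                    theta[:, keep], dxdt[:, k], rcond=None
                )[0]
    return xi

# Hypothetical stand-in for denoised states and derivatives; in the framework
# described above these would come from simulating the fitted neural network.
rng = np.random.default_rng(0)
X = rng.uniform(0.5, 3.0, size=(500, 2))
a, b, c, d = 1.0, 0.5, 1.0, 0.25  # assumed "true" parameters
dX = np.column_stack([
    a * X[:, 0] - b * X[:, 0] * X[:, 1],
    -c * X[:, 1] + d * X[:, 0] * X[:, 1],
])
dX += rng.normal(scale=0.01, size=dX.shape)  # small residual noise

xi = stlsq(library(X), dX)
for k, name in enumerate(["dx/dt", "dy/dt"]):
    rhs = " + ".join(
        f"{xi[j, k]:.2f}*{TERMS[j]}" for j in range(len(TERMS)) if xi[j, k] != 0
    )
    print(f"{name} = {rhs}")
```

The threshold controls how aggressively small coefficients are pruned. In a hybrid setting, the same regression would be applied only to the missing terms, with the known part of the model held fixed.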
