Design and Analysis of Massively Parallel Reporter Assays Using FORECAST
Overview
Machine learning is revolutionizing molecular biology and bioengineering by providing powerful insights and predictions. Massively parallel reporter assays (MPRAs) have emerged as a particularly valuable class of high-throughput techniques for supporting such algorithms. MPRAs enable the simultaneous characterization of thousands or even millions of genetic constructs and provide the large amounts of data needed to train models. However, while the scale of this approach is impressive, designing effective MPRA experiments is challenging because many factors can be varied and it is difficult to predict how each will affect the quality and quantity of the data obtained. Here, we present a computational tool called FORECAST, which can simulate MPRA experiments based on fluorescence-activated cell sorting and subsequent sequencing (commonly referred to as Flow-seq or Sort-seq experiments) and can carry out rigorous statistical estimation of construct performance from this type of experimental data. FORECAST can be used to develop workflows that aid the design of MPRA experiments and to reanalyze existing MPRA data sets.
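To make the idea of simulating a Sort-seq experiment concrete, the following is a minimal illustrative sketch, not FORECAST's implementation or API. It assumes that a construct's log10 fluorescence is normally distributed, that sorting gates are contiguous bins on the log scale, and that sequencing reads are drawn in proportion to the number of cells sorted into each bin. The function names (`simulate_sort_seq`, `naive_mean_estimate`) and all parameter values are hypothetical, and the estimator shown is a simple read-weighted average of bin centers rather than the statistical estimation the tool performs.

```python
import numpy as np

def simulate_sort_seq(mu, sigma, n_cells, bin_edges, n_reads, rng):
    """Simulate one construct and return its read counts per sorting bin.

    Assumption: log10 fluorescence ~ Normal(mu, sigma) for this construct.
    """
    # Draw the log10 fluorescence of each sorted cell.
    log_fluo = rng.normal(mu, sigma, size=n_cells)
    # Cells outside the gated range are placed in the outermost bins.
    log_fluo = np.clip(log_fluo, bin_edges[0], bin_edges[-1])
    cells_per_bin, _ = np.histogram(log_fluo, bins=bin_edges)
    # Assumption: reads are sampled in proportion to sorted cells per bin.
    return rng.multinomial(n_reads, cells_per_bin / cells_per_bin.sum())

def naive_mean_estimate(reads, bin_edges):
    """Read-weighted average of bin centers (a deliberately simple estimator)."""
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    return float(np.sum(reads * centers) / np.sum(reads))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    edges = np.linspace(1.0, 5.0, 9)  # 8 hypothetical sorting bins
    reads = simulate_sort_seq(3.0, 0.3, 1000, edges, 500, rng)
    print("reads per bin:", reads)
    print("estimated mean log10 fluorescence:", naive_mean_estimate(reads, edges))
```

Varying `n_cells`, `n_reads`, or the number of bins in a sketch like this shows how experimental design choices change the accuracy of the recovered mean, which is the kind of question a simulator is meant to answer before an experiment is run.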