Extracting Biologically Significant Patterns from Short Time Series Gene Expression Data
Background: Time series gene expression data analysis is widely used to study the dynamics of various cell processes. Most time series data available today consist of only a few time points, which makes standard clustering techniques difficult to apply.
Results: We developed two new algorithms that are capable of extracting biological patterns from short time series gene expression data. The two algorithms, ASTRO and MiMeSR, are inspired by the rank order preserving framework and the minimum mean squared residue approach, respectively. However, ASTRO and MiMeSR differ from previous approaches in that they take advantage of the relatively small number of time points to reduce the complexity of the problem from NP-hard to linear. Tested on well-defined short time series expression data, our approaches proved robust to noise and to random patterns, and they correctly detected the temporal expression profiles of relevant functional categories. We evaluated our methods using Gene Ontology (GO) annotations and chromatin immunoprecipitation (ChIP-chip) data.
Conclusion: Our approaches generally outperform both standard clustering algorithms and algorithms designed specifically for clustering of short time series gene expression data. Both algorithms are available at http://www.benoslab.pitt.edu/astro/.
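The two ideas named in the Results can be illustrated with a minimal sketch. This is not the published ASTRO or MiMeSR code; the function names and the toy expression values are hypothetical and chosen only for illustration. The first part groups genes by the order in which their time points rank, which takes a single pass over the genes when the number of time points is small and fixed. The second part computes the mean squared residue of a group of profiles using the standard biclustering definition, which is assumed here to be the quantity MiMeSR builds on.

```python
from collections import defaultdict


def rank_pattern(profile):
    """Return time-point indices ordered by increasing expression.

    Ties are broken by time-point index (an assumption of this sketch).
    """
    return tuple(sorted(range(len(profile)), key=lambda t: (profile[t], t)))


def group_by_rank_pattern(expression):
    """Group genes whose profiles share the same rank order.

    `expression` maps a gene name to its list of expression values.
    Each gene is visited once, so the cost grows linearly with the
    number of genes for a fixed number of time points.
    """
    groups = defaultdict(list)
    for gene, profile in expression.items():
        groups[rank_pattern(profile)].append(gene)
    return dict(groups)


def mean_squared_residue(rows):
    """Mean squared residue of a genes-by-time-points submatrix.

    residue(i, j) = a_ij - row_mean_i - col_mean_j + overall_mean
    """
    n, m = len(rows), len(rows[0])
    row_means = [sum(r) / m for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(m)]
    overall = sum(row_means) / n
    total = sum(
        (rows[i][j] - row_means[i] - col_means[j] + overall) ** 2
        for i in range(n)
        for j in range(m)
    )
    return total / (n * m)


# Hypothetical toy data: three genes measured at three time points.
toy = {
    "geneA": [0.1, 0.5, 0.9],
    "geneB": [1.0, 2.0, 3.0],
    "geneC": [0.9, 0.5, 0.1],
}
groups = group_by_rank_pattern(toy)
```

In this toy input, geneA and geneB both rise monotonically and therefore share a rank pattern, while geneC falls and lands in its own group. A group whose profiles differ only by a constant shift has a mean squared residue of zero, so lower values indicate more coherent groups.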