
Discriminating Sample Groups with Multi-way Data

Overview
Journal: Biostatistics
Specialty: Public Health
Date: 2017 Jan 25
PMID: 28115314
Citations: 8
Abstract

High-dimensional linear classifiers, such as distance weighted discrimination (DWD) and versions of the support vector machine (SVM), are commonly used in biomedical research to distinguish groups of subjects based on a large number of features. However, their use is limited to applications where a single vector of features is measured for each subject. In practice, data are often multi-way, or measured over multiple dimensions. For example, metabolite abundance may be measured over multiple regions or tissues, or gene expression may be measured over multiple time points, for the same subjects. We propose a framework for linear classification of high-dimensional multi-way data, in which coefficients can be factorized into weights that are specific to each dimension. More generally, the coefficients for each measurement in a multi-way dataset are assumed to have low-rank structure. This framework extends existing classification techniques from single-vector to multi-way features, and we have implemented multi-way versions of SVM and DWD. We describe informative simulation results, and apply multi-way DWD to data from two very different clinical research studies. The first study uses magnetic resonance spectroscopy metabolite data over multiple brain regions to compare participants with and without spinocerebellar ataxia; the second uses publicly available gene expression time-course data to compare degrees of treatment response among patients with multiple sclerosis. Our multi-way method can improve performance and simplify interpretation relative to naive applications of full-rank linear and non-linear classification to multi-way data. The R package is available at https://github.com/lockEF/MultiwayClassification.
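The factorization described in the abstract can be illustrated with a minimal sketch, assuming the simplest (rank-1) case: each subject has a P x D feature matrix X, and the coefficient matrix is B = w v^T, so the linear score is w^T X v + b. The function names, the toy data, and the use of plain Python are assumptions made for illustration only; this is not the interface of the R package, and it does not fit a classifier (the weights are simply given).

```python
def outer(w, v):
    """Rank-1 coefficient matrix B with B[p][d] = w[p] * v[d]."""
    return [[wp * vd for vd in v] for wp in w]


def score_full(X, B, b=0.0):
    """Linear score using an explicit P x D coefficient matrix B."""
    return sum(
        X[p][d] * B[p][d]
        for p in range(len(B))
        for d in range(len(B[0]))
    ) + b


def score_rank1(X, w, v, b=0.0):
    """Same score computed as w^T X v + b, without forming B."""
    return sum(
        w[p] * sum(X[p][d] * v[d] for d in range(len(v)))
        for p in range(len(w))
    ) + b


# Hypothetical toy subject: 3 features (e.g. metabolites)
# measured over 2 dimensions (e.g. brain regions).
X = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
w = [0.5, -1.0, 2.0]   # assumed feature-specific weights
v = [1.0, 0.5]         # assumed dimension-specific weights
b = -0.25              # assumed intercept

B = outer(w, v)
s_full = score_full(X, B, b)
s_rank1 = score_rank1(X, w, v, b)

# Classify by the sign of the score, as in a linear classifier.
label = 1 if s_rank1 > 0 else -1

# Coefficients in the full matrix versus the two weight vectors.
# (w and v are identifiable only up to a rescaling, so the
# effective number of free parameters is smaller still.)
n_params_full = len(w) * len(v)
n_params_rank1 = len(w) + len(v)
```

Both scoring functions give the same value here because B is exactly rank 1. The factorized form needs P + D weights rather than P x D, and each weight vector can be read on its own: w as the contribution of each feature, v as the contribution of each region or time point.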

Citing Articles

Multiway sparse distance weighted discrimination.

Guo B, Eberly L, Henry P, Lenglet C, Lock E. J Comput Graph Stat. 2023; 32(2):730-743.

PMID: 37377729 PMC: 10292743. DOI: 10.1080/10618600.2022.2099404.


Bayesian predictive modeling of multi-source multi-way data.

Kim J, Sandri B, Rao R, Lock E. Comput Stat Data Anal. 2023; 186.

PMID: 37274461 PMC: 10237362. DOI: 10.1016/j.csda.2023.107783.


Bayesian Distance Weighted Discrimination.

Lock E. J Comput Graph Stat. 2022; 31(4):1177-1188.

PMID: 36465095 PMC: 9717576. DOI: 10.1080/10618600.2022.2069778.


Tandem mass tag proteomic and untargeted metabolomic profiling reveals altered serum and CSF biochemical datasets in iron deficient monkeys.

Sandri B, Kim J, Lubach G, Lock E, Guerrero C, Higgins L. Data Brief. 2022; 45:108591.

PMID: 36164307 PMC: 9508431. DOI: 10.1016/j.dib.2022.108591.


Multiple sclerosis diagnosis and phenotype identification by multivariate classification of in vivo frontal cortex metabolite profiles.

Swanberg K, Kurada A, Prinsen H, Juchem C. Sci Rep. 2022; 12(1):13888.

PMID: 35974117 PMC: 9381573. DOI: 10.1038/s41598-022-17741-8.

