
Spatio-Temporal Image Representation of 3D Skeletal Movements for View-Invariant Action Recognition with Deep Convolutional Neural Networks

Overview
Journal Sensors (Basel)
Publisher MDPI
Specialty Biotechnology
Date 2019 Apr 27
PMID 31022945
Citations 7
Abstract

Designing motion representations for 3D human action recognition from skeleton sequences is an important yet challenging task. An effective representation should be robust to noise, invariant to viewpoint changes, and achieve good performance at low computational cost. Two main challenges in this task are how to efficiently represent spatio-temporal patterns of skeletal movements and how to learn discriminative features from them for classification. This paper presents a novel skeleton-based representation and a deep learning framework for 3D action recognition using RGB-D sensors. We propose to build an action map called the SPMF, a compact image representation constructed from skeleton poses and their motions. An Adaptive Histogram Equalization (AHE) algorithm is then applied to the SPMF to enhance its local patterns, forming an enhanced action map called the Enhanced-SPMF. For learning and classification, we exploit deep convolutional neural networks based on the DenseNet architecture to learn an end-to-end mapping between input skeleton sequences and their action labels directly from the Enhanced-SPMFs. The proposed method is evaluated on four challenging benchmark datasets covering individual actions, interactions, multi-view, and large-scale settings. The experimental results demonstrate that the proposed method outperforms previous state-of-the-art approaches on all benchmarks, while requiring low computational time for training and inference.
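
The pipeline described in the abstract (skeleton sequence → SPMF action map → adaptive histogram equalization → DenseNet classifier) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the helper functions skeleton_to_spmf and enhance_spmf are hypothetical, the exact pose/motion-to-pixel encoding is only approximated, and OpenCV's CLAHE stands in for the paper's AHE step.

```python
# Hypothetical sketch of an SPMF-style action-recognition pipeline.
import numpy as np
import cv2
import torch
from torchvision.models import densenet121

def skeleton_to_spmf(joints, out_size=(32, 32)):
    """Encode a skeleton sequence (T x J x 3) as a colour action map.

    Joint coordinates and frame-to-frame motions are min-max normalised to
    [0, 255] and stacked so that rows index features and columns index time,
    giving a compact image (an approximation of the paper's SPMF).
    """
    poses = joints.reshape(joints.shape[0], -1)            # T x (J*3)
    motions = np.diff(poses, axis=0, prepend=poses[:1])    # temporal differences
    feat = np.concatenate([poses, motions], axis=1).T      # features x time
    feat = (feat - feat.min()) / (feat.max() - feat.min() + 1e-8) * 255.0
    img = cv2.resize(feat.astype(np.uint8), out_size)
    return cv2.applyColorMap(img, cv2.COLORMAP_JET)        # 3-channel action map

def enhance_spmf(spmf, clip=2.0, grid=(8, 8)):
    """Per-channel adaptive histogram equalisation (CLAHE) -> Enhanced-SPMF."""
    clahe = cv2.createCLAHE(clipLimit=clip, tileGridSize=grid)
    return cv2.merge([clahe.apply(c) for c in cv2.split(spmf)])

# Classification with a DenseNet backbone, as in the paper's learning stage.
num_actions = 60                                            # e.g., NTU RGB+D
model = densenet121(num_classes=num_actions)
model.eval()

seq = np.random.rand(40, 25, 3).astype(np.float32)          # dummy 40-frame, 25-joint clip
x = enhance_spmf(skeleton_to_spmf(seq))
x = torch.from_numpy(x).permute(2, 0, 1).float().unsqueeze(0) / 255.0
with torch.no_grad():
    logits = model(x)                                       # 1 x num_actions scores
```

Encoding each sequence as a fixed-size image in this way lets a standard 2D CNN such as DenseNet be reused unchanged for sequence classification, which is the design choice the abstract highlights.
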

Citing Articles

A Deep Sequence Learning Framework for Action Recognition in Small-Scale Depth Video Dataset.

Bulbul M, Ullah A, Ali H, Kim D Sensors (Basel). 2022; 22(18).

PMID: 36146186 PMC: 9506565. DOI: 10.3390/s22186841.


Skeleton Driven Action Recognition Using an Image-Based Spatial-Temporal Representation and Convolution Neural Network.

Silva V, Soares F, Leao C, Esteves J, Vercelli G Sensors (Basel). 2021; 21(13).

PMID: 34201991 PMC: 8271982. DOI: 10.3390/s21134342.


Detection of sitting posture using hierarchical image composition and deep learning.

Kulikajevas A, Maskeliunas R, Damasevicius R PeerJ Comput Sci. 2021; 7:e442.

PMID: 33834109 PMC: 8022631. DOI: 10.7717/peerj-cs.442.


Application of Machine Learning in Air Hockey Interactive Control System.

Chang C, Chen S, Chang C, Jhou Y Sensors (Basel). 2020; 20(24).

PMID: 33348665 PMC: 7767285. DOI: 10.3390/s20247233.


Prediction of Human Activities Based on a New Structure of Skeleton Features and Deep Learning Model.

Jaouedi N, Perales F, Buades J, Boujnah N, Bouhlel M Sensors (Basel). 2020; 20(17).

PMID: 32882884 PMC: 7506930. DOI: 10.3390/s20174944.

