Deep Learning for Human Activity Recognition on 3D Human Skeleton: Survey and Comparative Study
Human activity recognition (HAR) is an important research problem in computer vision, with wide application in human-machine interaction, monitoring, and related areas. HAR based on the human skeleton in particular enables intuitive applications. Knowing the current state of results in this area is therefore important for selecting solutions and developing commercial products. In this paper, we present a comprehensive survey of deep learning methods that recognize human activity from three-dimensional (3D) human skeleton data. We group the methods into four types of deep learning networks according to the feature vectors they use: Recurrent Neural Networks (RNN), which use features extracted from activity sequences; Convolutional Neural Networks (CNN), which use feature vectors extracted from the projection of the skeleton into image space; Graph Convolutional Networks (GCN), which use features extracted from the skeleton graph and the spatio-temporal characteristics of the skeleton; and Hybrid Deep Neural Networks (Hybrid-DNN), which combine several types of features. The survey covers models, databases, metrics, and results from 2019 to March 2023, presented in chronological order. We also carry out a comparative study of HAR based on 3D human skeletons on the KLHA3D-102 and KLYOGA3D datasets, and we analyze and discuss the results obtained with CNN-based, GCN-based, and Hybrid-DNN-based deep learning networks.