
KSL-POSE: A Real-Time 2D Human Pose Estimation Method Based on Modified YOLOv8-Pose Framework

Overview
Journal Sensors (Basel)
Publisher MDPI
Specialty Biotechnology
Date 2024 Oct 16
PMID 39409288
Abstract

Two-dimensional human pose estimation aims to equip computers with the ability to accurately recognize human keypoints and comprehend their spatial contexts within media content. However, the accuracy of real-time human pose estimation diminishes when processing images with occluded body parts or overlapping individuals. To address these issues, we propose a method based on the YOLO framework. We integrate the convolutional concepts of Kolmogorov-Arnold Networks (KANs) by introducing non-linear activation functions that enhance the feature extraction capabilities of the convolutional kernels. Moreover, to improve the detection of small target keypoints, we integrate the cross-stage partial (CSP) approach and use the small object enhance pyramid (SOEP) module for feature integration. We also incorporate a layered shared convolution with batch normalization detection head (LSCB), consisting of multiple shared convolutional layers and batch normalization layers, to enable cross-stage feature fusion and address the low utilization of model parameters. Given the structure and purpose of the proposed model, we name it KSL-POSE. Compared to the baseline model YOLOv8l-POSE, KSL-POSE increases the average detection accuracy by 1.5% on the public MS COCO 2017 data set. The model also demonstrates competitive performance on the CrowdPOSE data set, which supports its generalization ability.
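The abstract does not spell out how the KAN-style convolution is built, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes that each kernel position applies a learnable univariate function of the form phi(x) = w * silu(x) + sum_k c_k * basis_k(x) instead of multiplying by a single scalar weight. For brevity, Gaussian radial basis functions stand in for the B-splines usually used in KANs. All names here (`kan_activation`, `kan_conv2d`, `centers`, `width`) are hypothetical and chosen for illustration.

```python
import math


def silu(x):
    """SiLU base function: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def rbf_basis(x, centers, width):
    """Gaussian bumps used here as a simple stand-in for B-spline bases."""
    return [math.exp(-((x - c) / width) ** 2) for c in centers]


def kan_activation(x, base_weight, coeffs, centers, width=1.0):
    """Learnable univariate function: base_weight * silu(x) + sum_k coeffs[k] * basis_k(x)."""
    basis = rbf_basis(x, centers, width)
    return base_weight * silu(x) + sum(c * b for c, b in zip(coeffs, basis))


def kan_conv2d(image, base_weights, coeffs, centers, width=1.0):
    """Single-channel, stride-1, no-padding convolution on nested lists.

    base_weights[i][j] is a scalar and coeffs[i][j] is a list of basis
    coefficients, so every kernel position owns its own non-linear function.
    """
    kh, kw = len(base_weights), len(base_weights[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for r in range(out_h):
        for c in range(out_w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    # Apply the per-position function to the pixel, then sum.
                    total += kan_activation(
                        image[r + i][c + j],
                        base_weights[i][j],
                        coeffs[i][j],
                        centers,
                        width,
                    )
            out[r][c] = total
    return out


if __name__ == "__main__":
    # Toy 3x3 input and a 2x2 kernel with three basis functions per position.
    image = [[0.0, 0.5, 1.0], [1.0, -0.5, 0.0], [0.25, 0.75, -1.0]]
    base_weights = [[1.0, 0.5], [0.5, 1.0]]
    coeffs = [[[0.1, 0.2, 0.1] for _ in range(2)] for _ in range(2)]
    print(kan_conv2d(image, base_weights, coeffs, centers=[-1.0, 0.0, 1.0]))
```

If every coefficient list is zero and every base weight is one, each output value reduces to the sum of silu over the window, which makes the extra capacity contributed by the basis terms easy to isolate. In a real model these parameters would be trained tensors in a deep learning framework rather than Python lists.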

Citing Articles

MambaPose: A Human Pose Estimation Based on Gated Feedforward Network and Mamba.

Zhang J, Hou J, He Q, Yuan Z, Xue H. Sensors (Basel). 2025; 24(24).

PMID: 39771893 PMC: 11679066. DOI: 10.3390/s24248158.
