Abstract

Human pose estimation is an interesting and underlying topic in various fields such as action recognition and human-computer interaction. Although many methods have been developed recently, they are still far from perfect in accuracy and speed at a time. In this paper, we propose a Classification-based Pose Estimation Network with Multi-task Learning (CPENML) based on the low-resolution feature map to improve accuracy and inference time simultaneously. The proposed CPENML consists of two ideas. Firstly, novel proposed keypoint and offset estimation tasks based on classification achieve better performance than regression. Secondly, the proposed Multi-Scale Network (MSN) makes robust feature maps and balances the keypoint and offset tasks to maximize performance. To prove the effectiveness of the proposed method, we conduct ablation studies on the COCO dataset for proposed ideas. Compared to benchmarks, we demonstrate the superiority of our proposed method on COCO dataset in terms of inference time and accuracy.

COMPUTER VISION

Classification-based Multi-task Learning for Efficient Pose Estimation Network

International Conference on Pattern Recognition (ICPR)

2022-08-21

Abstract