
Kinematic-structure-preserved representation for unsupervised 3d human pose estimation

Kundu, JN and Seth, S and Rahul, MV and Rakesh, M and Babu, RV and Chakraborty, A (2020) Kinematic-structure-preserved representation for unsupervised 3d human pose estimation. In: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence, 7 - 12 February 2020, New York, pp. 11312-11319.

Official URL: https://doi.org/10.1609/aaai.v34i07.6792


Estimation of 3D human pose from a monocular image has gained considerable attention as a key step in several human-centric applications. However, the generalizability of human pose estimation models developed using supervision on large-scale in-studio datasets remains questionable, as these models often perform unsatisfactorily in unseen in-the-wild environments. Though weakly-supervised models have been proposed to address this shortcoming, the performance of such models relies on the availability of paired supervision on some related task, such as 2D pose or multi-view image pairs. In contrast, we propose a novel kinematic-structure-preserved unsupervised 3D pose estimation framework, which is not restrained by any paired or unpaired weak supervision. Our pose estimation framework relies on a minimal set of prior knowledge that defines the underlying kinematic 3D structure, such as skeletal joint connectivity information with bone-length ratios in a fixed canonical scale. The proposed model employs three consecutive differentiable transformations, namely forward kinematics, camera projection, and spatial-map transformation. This design not only acts as a suitable bottleneck stimulating effective pose disentanglement, but also yields interpretable latent pose representations, avoiding the training of an explicit latent-embedding-to-pose mapper. Furthermore, devoid of an unstable adversarial setup, we re-utilize the decoder to formalize an energy-based loss, which enables us to learn from in-the-wild videos, beyond laboratory settings. Comprehensive experiments demonstrate our state-of-the-art unsupervised and weakly-supervised pose estimation performance on both the Human3.6M and MPI-INF-3DHP datasets. Qualitative results on unseen environments further establish our superior generalization ability. © 2020, Association for the Advancement of Artificial Intelligence.
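
The chain of three transformations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the five-joint skeleton, the bone-length values, the direction-vector parameterization, the weak-perspective camera, and the Gaussian heatmap rendering are all assumptions chosen for illustration, and the helper names (`forward_kinematics`, `project`, `spatial_map`) are hypothetical. The code uses plain Python arithmetic; in practice the same steps would presumably be written in an automatic-differentiation framework so that gradients can flow through them.

```python
import math

# Hypothetical 5-joint chain: pelvis -> spine -> neck -> head, plus neck -> shoulder.
PARENTS = [-1, 0, 1, 2, 2]
# Illustrative bone-length ratios in a fixed canonical scale (not values from the paper).
BONE_LENGTHS = [0.0, 1.0, 1.0, 0.5, 0.6]


def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def forward_kinematics(directions):
    """Place each joint at its parent's position plus a fixed-length bone."""
    joints = []
    for j, parent in enumerate(PARENTS):
        if parent < 0:
            joints.append([0.0, 0.0, 0.0])  # root joint sits at the origin
            continue
        d = normalize(directions[j])  # only the direction is free, not the length
        p = joints[parent]
        joints.append([p[k] + BONE_LENGTHS[j] * d[k] for k in range(3)])
    return joints


def project(joints, yaw, scale=1.0):
    """Rotate about the vertical (y) axis, then drop depth (weak perspective)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [(scale * (c * x + s * z), scale * y) for x, y, z in joints]


def spatial_map(point, size=8, extent=3.0, sigma=0.5):
    """Render one 2D joint as a Gaussian heatmap on a size x size grid."""
    grid = []
    for row in range(size):
        gy = extent - 2 * extent * row / (size - 1)  # top row is +extent
        line = []
        for col in range(size):
            gx = -extent + 2 * extent * col / (size - 1)
            d2 = (gx - point[0]) ** 2 + (gy - point[1]) ** 2
            line.append(math.exp(-d2 / (2 * sigma ** 2)))
        grid.append(line)
    return grid


# Arbitrary (unnormalized) bone directions standing in for a network's output.
directions = [None, [0, 1, 0], [0, 2, 0.5], [0, 1, 0], [1, 0, 0]]
joints_3d = forward_kinematics(directions)        # step 1: forward kinematics
joints_2d = project(joints_3d, yaw=math.pi / 6)   # step 2: camera projection
heatmaps = [spatial_map(p) for p in joints_2d]    # step 3: spatial maps
```

Because every bone is rescaled to its prescribed length inside `forward_kinematics`, any input directions yield a skeleton with the same bone-length ratios, which is one simple way to picture how a kinematic prior can act as a structural bottleneck.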

Item Type: Conference Paper
Publication: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
Publisher: AAAI Press
Additional Information: The copyright for this article belongs to AAAI Press.
Keywords: Kinematics; Large dataset; 3D human pose estimation; 3D pose estimation; Camera projection; Generalization ability; Human pose estimations; Kinematic structures; Multi-view image; State of the art; Artificial intelligence
Department/Centre: Division of Interdisciplinary Sciences > Computational and Data Sciences
Date Deposited: 07 Feb 2023 10:49
Last Modified: 07 Feb 2023 10:49
URI: https://eprints.iisc.ac.in/id/eprint/80015
