Trajectory based Deep Policy Search for Quadrupedal Walking

Kolathaya, S and Ghosal, A and Amrutur, B and Joglekar, A and Shetty, S and Dholakiya, D and Abhimanyu, . and Sagi, A and Bhattacharya, S and Singla, A and Bhatnagar, S (2019) Trajectory based Deep Policy Search for Quadrupedal Walking. In: 28th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2019, 14-18 October 2019, New Delhi; India.


Official URL: https://dx.doi.org/10.1109/RO-MAN46459.2019.895636...

Abstract

In this paper, we explore a specific form of deep reinforcement learning (D-RL) technique for quadrupedal walking - trajectory based policy search via deep policy networks. Existing approaches determine optimal policies for each time step, whereas we propose to determine an optimal policy for each walking step. We justify our approach based on the fact that animals including humans use 'low' dimensional trajectories at the joint level to realize walking. We will construct these trajectories by using Bézier polynomials, with the coefficients being determined by a parameterized policy. In order to maintain smoothness of the trajectories during step transitions, hybrid invariance conditions are also applied. The action is computed at the beginning of every step, and a linear PD control law is applied to track at the individual joints. After each step, reward is computed, which is then used to update the new policy parameters for the next step. After learning an optimal policy, i.e., an optimal walking gait for each step, we then successfully play them in a custom built quadruped robot, Stoch 2, thereby validating our approach. © 2019 IEEE.
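
The per-step scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the gains, and in particular `enforce_continuity` are assumptions made for illustration. `enforce_continuity` is only a simplified stand-in for the hybrid invariance conditions, matching position and slope across a step transition under the assumption that consecutive steps use Bézier polynomials of the same degree and the same step duration. In the paper the coefficients come from a learned policy network; here they are plain lists.

```python
from math import comb


def bezier(coeffs, s):
    """Evaluate a Bezier polynomial with coefficients `coeffs` at phase s in [0, 1]."""
    n = len(coeffs) - 1
    return sum(c * comb(n, k) * s**k * (1 - s) ** (n - k)
               for k, c in enumerate(coeffs))


def bezier_derivative(coeffs, s):
    """Derivative of the Bezier polynomial with respect to the phase s."""
    n = len(coeffs) - 1
    diffs = [n * (coeffs[k + 1] - coeffs[k]) for k in range(n)]
    return bezier(diffs, s)


def enforce_continuity(coeffs, prev_coeffs):
    """Hypothetical stand-in for the invariance conditions (an assumption):
    overwrite the first two coefficients so that the new step starts at the
    position and slope where the previous step ended."""
    new = list(coeffs)
    new[0] = prev_coeffs[-1]
    new[1] = new[0] + (prev_coeffs[-1] - prev_coeffs[-2])
    return new


def pd_torque(q, dq, q_des, dq_des, kp, kd):
    """Linear PD law tracking the desired joint position and velocity."""
    return kp * (q_des - q) + kd * (dq_des - dq)


# One joint, two consecutive steps. In the paper the coefficients would be
# the action produced by the policy once per walking step.
prev_step = [0.0, 0.2, 0.5, 0.6]
raw_action = [0.9, 0.9, 0.1, 0.0]            # illustrative policy output
this_step = enforce_continuity(raw_action, prev_step)

# Desired trajectory sampled over the phase of the current step.
for i in range(5):
    s = i / 4
    q_des = bezier(this_step, s)
    dq_des = bezier_derivative(this_step, s)
    tau = pd_torque(q=0.0, dq=0.0, q_des=q_des, dq_des=dq_des, kp=10.0, kd=0.5)
    print(f"s={s:.2f}  q_des={q_des:.3f}  dq_des={dq_des:.3f}  tau={tau:.3f}")
```

The point of the sketch is the division of labour: the action is a small set of coefficients chosen once per step, while the PD law runs at every control tick to track the resulting trajectory.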

Item Type:	Conference Paper
Publication:	2019 28th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2019
Publisher:	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Additional Information:	Copyright of this article belongs to IEEE
Keywords:	Reinforcement learning; Robots; Trajectories; Deep-RL; Invariance condition; Optimal policies; Quadruped; Quadruped Robots; Step transitions; Trajectory-based; Walking trajectory; Deep learning
Department/Centre:	Division of Interdisciplinary Sciences > Robert Bosch Centre for Cyber Physical Systems
Date Deposited:	04 Mar 2020 10:27
Last Modified:	04 Mar 2020 10:27
URI:	http://eprints.iisc.ac.in/id/eprint/64613


