
The Reinforce Policy Gradient Algorithm Revisited

Bhatnagar, S. The Reinforce Policy Gradient Algorithm Revisited. In: 2023 9th Indian Control Conference, ICC 2023, p. 177.

PDF: 9th_Ind_con_con_2023.pdf - Published Version (203kB)

Abstract

We revisit the Reinforce policy gradient algorithm, which works with full cost returns obtained over random-length episodes. We propose a new Reinforce-type algorithm that estimates the policy gradient from a function measurement at a perturbed parameter, using a smoothed-functional gradient estimator. We observe that even though we estimate the gradient of the performance objective using sample performance (and not the sample gradient), the algorithm converges to a neighborhood of a local minimum. We further describe the main convergence result. © 2023 IEEE.
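
To make the idea of estimating a gradient from sample performance alone more concrete, here is a minimal Python sketch of a generic one-measurement Gaussian smoothed functional estimator. It is not taken from the paper: the function names, the toy quadratic cost, and the plain descent step are all assumptions made for illustration, and the paper's exact estimator, step sizes, and any projection may differ. In the Reinforce setting, `sample_cost` would presumably return the total cost of one episode run under the perturbed policy parameter.

```python
import random


def sf_gradient_estimate(sample_cost, theta, delta, rng):
    """One-measurement smoothed functional (SF) gradient estimate.

    Draws a perturbation with i.i.d. N(0, 1) components, takes a single
    cost measurement at theta + delta * perturbation, and returns
    (perturbation / delta) * cost as the gradient estimate.
    """
    perturbation = [rng.gauss(0.0, 1.0) for _ in theta]
    perturbed = [t + delta * p for t, p in zip(theta, perturbation)]
    # Hypothetical measurement: e.g. the cost return of one episode.
    cost = sample_cost(perturbed)
    return [p * cost / delta for p in perturbation]


def sf_descent_step(sample_cost, theta, step_size, delta, rng):
    """Move theta against the estimated gradient (cost minimization)."""
    grad = sf_gradient_estimate(sample_cost, theta, delta, rng)
    return [t - step_size * g for t, g in zip(theta, grad)]


def quadratic_cost(theta):
    # Toy stand-in for the performance objective; not from the paper.
    return sum(t * t for t in theta)


def average_estimate(sample_cost, theta, delta, num_samples, seed=0):
    """Average many independent SF estimates at a fixed theta."""
    rng = random.Random(seed)
    total = [0.0] * len(theta)
    for _ in range(num_samples):
        estimate = sf_gradient_estimate(sample_cost, theta, delta, rng)
        total = [a + e for a, e in zip(total, estimate)]
    return [a / num_samples for a in total]
```

A single estimate of this kind is very noisy, so the sketch includes `average_estimate` only to show that the estimates centre on the gradient of the smoothed objective (for the toy quadratic, that is `2 * theta`). An actual algorithm would more likely rely on small or diminishing step sizes in repeated calls to something like `sf_descent_step` rather than on explicit averaging.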

Item Type: Conference Paper
Publication: 2023 9th Indian Control Conference, ICC 2023 - Proceedings
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 16 May 2024 08:48
Last Modified: 16 May 2024 08:48
URI: https://eprints.iisc.ac.in/id/eprint/84513
