
ERLP: Ensembles of Reinforcement Learning Policies

Saphal, R and Ravindran, B and Mudigere, D and Avancha, S and Kaul, B (2020) ERLP: Ensembles of Reinforcement Learning Policies. In: 34th AAAI Conference on Artificial Intelligence, 7-12 Feb 2020, New York, pp. 13905-13906.



Reinforcement learning algorithms are sensitive to hyperparameters and require environment-specific tuning to perform well. Ensembles of reinforcement learning models, on the other hand, are known to be much more robust and stable. However, training multiple models independently on an environment suffers from high sample complexity. We present here a methodology to create multiple models from a single training instance that can be used in an ensemble, through directed perturbation of the model parameters at regular intervals. As a result of the perturbation, a single model converges to several local minima during the optimization process. By saving the model parameters at each such instance, we obtain multiple policies during training that are ensembled during evaluation. We evaluate our approach on challenging discrete and continuous control tasks and also discuss various ensembling strategies. Our framework is substantially sample efficient, computationally inexpensive and is seen to outperform state-of-the-art (SOTA) approaches. © 2020 The Twenty-Fifth AAAI/SIGAI Doctoral Consortium (AAAI-20). All Rights Reserved.
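The procedure described in the abstract (train once, save the parameters at regular intervals, perturb, then ensemble the saved policies at evaluation time) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the form of the directed perturbation or which ensembling strategies are used, so `update` and `perturb` are caller-supplied placeholders, and majority voting and action averaging are assumed here only as two common ways to combine discrete and continuous actions. All function names are hypothetical.

```python
from collections import Counter
import copy


def train_with_snapshots(params, total_steps, interval, update, perturb):
    """Run a single training instance, saving a snapshot every `interval` steps.

    `update` and `perturb` are hypothetical callables supplied by the caller;
    the abstract does not say how the perturbation is directed.
    """
    snapshots = []
    for step in range(1, total_steps + 1):
        params = update(params)  # ordinary learning update (placeholder)
        if step % interval == 0:
            snapshots.append(copy.deepcopy(params))  # keep this policy for the ensemble
            params = perturb(params)  # move the parameters away from the saved solution
    return snapshots


def majority_vote(actions):
    """Discrete control (assumed strategy): most frequent action, ties go to the smallest."""
    counts = Counter(actions)
    best = max(counts.values())
    return min(a for a, c in counts.items() if c == best)


def average_action(actions):
    """Continuous control (assumed strategy): element-wise mean of the action vectors."""
    n = len(actions)
    return [sum(dim) / n for dim in zip(*actions)]


# Toy usage: parameters are a list of floats, the "update" halves them,
# and the "perturbation" shifts them by a constant.
snapshots = train_with_snapshots(
    [1.0, -1.0],
    total_steps=10,
    interval=5,
    update=lambda p: [0.5 * x for x in p],
    perturb=lambda p: [x + 1.0 for x in p],
)
```

Only one training run is consumed in this sketch, which is where the sample-efficiency argument comes from: the number of policies in the ensemble is `total_steps // interval`, without any additional independent training instances.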

Item Type: Conference Paper
Publication: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
Publisher: AAAI Press
Additional Information: The copyright for this article belongs to AAAI Press
Keywords: Reinforcement learning; Continuous control; Hyperparameters; Improving performance; Local minimums; Model parameters; Reinforcement learning models; Sample complexity; State of the art; Learning algorithms
Department/Centre: Division of Interdisciplinary Sciences > Robert Bosch Centre for Cyber Physical Systems
Date Deposited: 18 Aug 2021 09:54
Last Modified: 18 Aug 2021 09:54
URI: http://eprints.iisc.ac.in/id/eprint/69290
