Borkar, VS and Meyn, SP (2000) The ODE method for convergence of stochastic approximation and reinforcement learning. In: SIAM Journal on Control and Optimization, 38 (2). pp. 447-469.