Santharam, G and Sastry, PS (1997) A reinforcement learning neural network for adaptive control of Markov chains. In: IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, 27 (5). pp. 588-600.