A Pulse-Based Reinforcement Algorithm For Learning Continuous Functions (2007)
Abstract
Introduction
Reinforcement learning [1] has many attractions for neural networks: it is more widely applicable than supervised techniques (since target values are not needed for any of the processing nodes), it is biologically plausible, and it is less prone to being trapped in local minima than classical methods like error backpropagation. However, standard reinforcement techniques have a major disadvantage: they were developed for systems with only a finite number of behavioural responses, which makes them unsuitable for learning real-valued functions. In the case of the A_RP neural model [1] there are two such responses: output 0 (off, not firing) and output 1 (on, firing). The A_RP reinforcement learning rule has the inevitable effect of driving neurons into saturation, changing weight parameters so that neurons will eventually output 0 or 1 with certainty for almost any input pattern. This behaviour is adequate for some contro
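The saturation effect described above can be illustrated with a minimal sketch. It assumes the commonly cited form of the associative reward-penalty (A_RP) update for a single stochastic binary unit with a logistic firing probability; the excerpt does not state the rule, so the update form, the learning rate `rho`, the penalty factor `lam`, the single constant input, and the toy reward function are all assumptions made for illustration, not the paper's setup.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def arp_step(w, x, rng, reward_fn, rho=0.5, lam=0.1):
    """One assumed A_RP update for a single stochastic binary unit.

    rho (learning rate) and lam (penalty factor) are illustrative values.
    Returns the updated weights and the firing probability before the update.
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))  # P(y = 1 | x)
    y = 1 if rng.random() < p else 0                   # sample the binary output
    r = reward_fn(x, y)                                # 1 = reward, 0 = penalty
    if r == 1:
        delta = rho * (y - p)              # reinforce the emitted output
    else:
        delta = lam * rho * ((1 - y) - p)  # push towards the opposite output
    return [wi + delta * xi for wi, xi in zip(w, x)], p


def train(steps=2000, seed=0):
    rng = random.Random(seed)
    w = [0.0]   # one weight, so the unit starts with P(y = 1) = 0.5
    x = [1.0]   # hypothetical constant input (a bias-only pattern)
    reward_fn = lambda x, y: 1 if y == 1 else 0  # toy task: firing is rewarded
    history = []
    for _ in range(steps):
        w, p = arp_step(w, x, rng, reward_fn)
        history.append(p)
    return w, history


w, history = train()
print(f"initial P(y=1) = {history[0]:.3f}, final P(y=1) = {history[-1]:.3f}")
```

In this toy task both the reward and the penalty branch increase the weight, so the firing probability climbs towards 1 and the unit ends up emitting the same binary value almost deterministically. That is harmless for a two-way decision but leaves no room to represent an intermediate real value.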