On the use of backpropagation in associative reinforcement learning

Williams

doi:10.1109/icnn.1988.23856

Abstract

A description is given of several ways that backpropagation can be useful in training networks to perform associative reinforcement learning tasks. One way is to train a second network to model the environmental reinforcement signal and to backpropagate through this network into the first network. This technique has been proposed and explored previously in various forms. Another way is based on the use of the REINFORCE algorithm and amounts to backpropagating through deterministic parts of the network while performing a correlation-style computation where the behavior is stochastic. A third way, which is an extension of the second, allows backpropagation through the stochastic parts of the network as well. The mathematical validity of this third technique rests on the use of continuous-valued stochastic units. Some implications of this result for using supervised learning to train networks of stochastic units are noted, and it is also observed that such an approach even permits a seamless blend of associative reinforcement learning and supervised learning within the same network.
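
The second approach in the abstract can be illustrated with a small sketch. It is not taken from the paper: the network shape (one tanh hidden layer feeding a single Bernoulli-logistic output unit), the toy task (reward 1 when the sampled action equals the input bit), the running-average baseline, and all function names are assumptions made for illustration. The stochastic unit contributes the correlation-style term (y - p), and that term is then backpropagated through the deterministic hidden layer to give the gradient of ln Pr(y | x) for every weight.

```python
import math
import random

def forward(params, x):
    """Deterministic part: tanh hidden layer, then firing probability p."""
    W1, b1, w2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    s = sum(w * hj for w, hj in zip(w2, h)) + b2
    p = 1.0 / (1.0 + math.exp(-s))
    return h, p

def log_prob(params, x, y):
    """ln Pr(y | x) for the Bernoulli output unit."""
    _, p = forward(params, x)
    return math.log(p) if y == 1 else math.log(1.0 - p)

def eligibility(params, x, y):
    """Gradient of ln Pr(y | x) w.r.t. all weights, via backpropagation."""
    W1, b1, w2, b2 = params
    h, p = forward(params, x)
    d_s = y - p  # derivative of ln Pr w.r.t. the unit's net input
    g_w2 = [d_s * hj for hj in h]
    g_b2 = d_s
    # Backpropagate through the deterministic tanh layer.
    d_a = [d_s * w * (1.0 - hj * hj) for w, hj in zip(w2, h)]
    g_W1 = [[da * xi for xi in x] for da in d_a]
    g_b1 = list(d_a)
    return g_W1, g_b1, g_w2, g_b2

def train(steps=5000, lr=0.5, hidden=2, seed=0):
    """REINFORCE-style training on a hypothetical one-bit associative task."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-0.5, 0.5)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    baseline = 0.5            # assumed running-average reinforcement baseline
    window = min(500, steps)  # trials used for the reported mean reward
    total = 0.0
    for t in range(steps):
        x = [float(rng.randint(0, 1))]
        params = (W1, b1, w2, b2)
        _, p = forward(params, x)
        y = 1 if rng.random() < p else 0      # stochastic action
        r = 1.0 if y == int(x[0]) else 0.0    # scalar reinforcement
        g_W1, g_b1, g_w2, g_b2 = eligibility(params, x, y)
        scale = lr * (r - baseline)
        for j in range(hidden):
            W1[j][0] += scale * g_W1[j][0]
            b1[j] += scale * g_b1[j]
            w2[j] += scale * g_w2[j]
        b2 += scale * g_b2
        baseline += 0.05 * (r - baseline)
        if t >= steps - window:
            total += r
    return (W1, b1, w2, b2), total / window
```

Calling train() returns the final weights and the mean reward over the last trials. Under these assumptions the mean would be expected to rise above the 0.5 chance level, though the sketch makes no claim about the results reported in the paper.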

Keywords

Backpropagation, Computer science, Associative property, Reinforcement learning, Artificial intelligence, Associative learning, Artificial neural network, Machine learning, Supervised learning, Mathematics

Affiliated Institutions

Northeastern University US

Related Publications

Pattern-recognizing stochastic learning automata

Andrew G. Barto , P. Anandan

A class of learning tasks is described that combines aspects of learning automaton tasks and supervised learning pattern-classification tasks. These tasks are called associativ...

1985 IEEE Transactions on Systems Man and ... 309 citations

BPS: a learning algorithm for capturing the dynamic nature of speech

Gori , Bengio , De Mori

A novel backpropagation learning algorithm for a particular class of dynamic neural networks in which some units have a local feedback is proposed. Hence these networks can be t...

1989 58 citations

CDMA-IC: a novel code division multiple access scheme based on interference cancellation

P. Dent , B. Gudmundson , M. Ewerbring

Third generation cellular systems will need to increase capacity significantly from previous generations. A system based on code division multiple access may be of interest prov...

2003 84 citations

Fast self-organization by the probing algorithm

Lampinen , Oja

A new computational algorithm, the probing algorithm, is introduced for the subproblem of finding the best matching unit in Kohonen's self-organization procedure (Self-Organizat...

1989 International Joint Conference on Neu... 26 citations

Statistical pattern recognition with neural networks: benchmarking studies

Kohonen , Barna , Ron Chrisley

Three basic types of neural-like networks (backpropagation network, Boltzmann machine, and learning vector quantization) were applied to two representative artificial statistic...

1988 IEEE International Conference on Neur... 347 citations

Publication Info

Year: 1988
Type: article
Pages: 263-270 vol.1
Citations: 74
Access: Closed

Cite This

APA Style

                            
Williams (1988). On the use of backpropagation in associative reinforcement learning. IEEE International Conference on Neural Networks, 263-270 vol.1. https://doi.org/10.1109/icnn.1988.23856

Identifiers

DOI: 10.1109/icnn.1988.23856