Abstract

In associative reinforcement learning, an environment generates input vectors, a learning system generates possible output vectors, and a reinforcement function computes feedback signals from the input-output pairs. The task is to discover and remember input-output pairs that generate rewards. Especially difficult cases occur when rewards are rare, since the expected time for any algorithm can grow exponentially with the size of the problem. Nonetheless, if a reinforcement function possesses regularities, and a learning algorithm exploits them, learning time can be reduced below that of non-generalizing algorithms. This paper describes a neural network algorithm called complementary reinforcement back-propagation (CRBP), and reports simulation results on problems designed to offer differing opportunities for generalization.
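The complementary-targeting idea behind CRBP can be illustrated with a short sketch: treat the network's sigmoid outputs as firing probabilities, sample a binary output vector, and on reward back-propagate toward the sampled vector, while on punishment back-propagate toward its bitwise complement. The single-layer network, learning rate, and toy identity-mapping reinforcement function below are illustrative assumptions, not the paper's experimental setup, and the paper's exact training schedule is not reproduced here.

```python
# A minimal sketch of the complementary reinforcement idea, assuming a
# single sigmoid layer and a toy reinforcement function; none of these
# choices come from the paper's experiments.
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out = 4, 4
W = rng.normal(0.0, 0.1, (n_out, n_in))  # weights, small random init
b = np.zeros(n_out)                      # output biases
lr = 0.5                                 # illustrative learning rate

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reinforcement(x, y):
    """Toy reinforcement function: reward only the identity mapping."""
    return 1.0 if np.array_equal(x, y) else -1.0

for step in range(5000):
    x = rng.integers(0, 2, n_in).astype(float)  # environment's input vector
    p = sigmoid(W @ x + b)                      # output firing probabilities
    y = (rng.random(n_out) < p).astype(float)   # sampled binary output vector
    r = reinforcement(x, y)
    # Complementary targeting: train toward the emitted output on reward,
    # toward its bitwise complement on punishment.
    target = y if r > 0 else 1.0 - y
    delta = target - p                          # sigmoid/cross-entropy error term
    W += lr * np.outer(delta, x)
    b += lr * delta

x = rng.integers(0, 2, n_in).astype(float)
print("input:", x, "output probabilities:", np.round(sigmoid(W @ x + b), 2))
```

Because outputs are sampled rather than thresholded, the network keeps exploring until rewarded pairs are found and reinforced; the complement target is what drives the search away from punished outputs.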

Keywords

Reinforcement learning, Generalization, Computer science, Artificial intelligence, Reinforcement, Associative property, Function (biology), Exploit, Learning classifier system, Function approximation, Artificial neural network, Task (project management), Machine learning, Mathematics, Engineering

Publication Info

Year: 1989
Type: article
Volume: 2
Pages: 550-557
Citations: 54 (OpenAlex)
Access: Closed

Cite This

David H. Ackley, Michael L. Littman (1989). Generalization and Scaling in Reinforcement Learning. Advances in Neural Information Processing Systems, 2, 550-557.