Abstract
In associative reinforcement learning, an environment generates input vectors, a learning system generates possible output vectors, and a reinforcement function computes feedback signals from the input-output pairs. The task is to discover and remember input-output pairs that generate rewards. Especially difficult cases occur when rewards are rare, since the expected time for any algorithm can grow exponentially with the size of the problem. Nonetheless, if a reinforcement function possesses regularities, and a learning algorithm exploits them, learning time can be reduced below that of non-generalizing algorithms. This paper describes a neural network algorithm called complementary reinforcement back-propagation (CRBP), and reports simulation results on problems designed to offer differing opportunities for generalization.
Keywords
Related Publications
Neural computation by concentrating information in time.
An analog model neural network that can solve a general problem of recognizing patterns in a time-dependent signal is presented. The networks use a patterned set of delays to co...
Single-Image Super-Resolution Using Sparse Regression and Natural Image Prior
This paper proposes a framework for single-image super-resolution. The underlying idea is to learn a map from input low-resolution images to target high-resolution images based ...
A new optimizer using particle swarm theory
The optimization of nonlinear functions using particle swarm methodology is described. Implementations of two paradigms are discussed and compared, including a recently develope...
Network In Network
Abstract: We propose a novel deep network structure called In Network (NIN) to enhance model discriminability for local patches within the receptive field. The conventional con...
Random-Walk Computation of Similarities between Nodes of a Graph with Application to Collaborative Recommendation
This work presents a new perspective on characterizing the similarity between elements of a database or, more generally, nodes of a weighted and undirected graph. It is based on...
Publication Info
- Year
- 1989
- Type
- article
- Volume
- 2
- Pages
- 550-557
- Citations
- 54
- Access
- Closed