Abstract:
Reinforcement Learning (RL) is a learning control paradigm that
provides well-understood algorithms with good convergence and
consistency properties. Unfortunately, these algorithms require the
process states and control actions to take values in finite, discrete sets.
Approximate solutions using fuzzy representations have been proposed
in the literature for the case when the states and possibly the
actions are continuous. However, the link between these mainly
heuristic solutions and the larger body of work on approximate RL,
including convergence results, has not been made explicit. In this
paper, we propose a fuzzy approximation structure for the Q-value
iteration algorithm and show that the resulting algorithm is
convergent. The proof extends previous convergence results in
approximate RL. We then propose a modified, serial version of the
algorithm that is guaranteed to converge at least as fast as the
original algorithm. An illustrative simulation example is also
provided.
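
As a rough illustration of the kind of algorithm the abstract refers to, the sketch below implements one plausible form of fuzzy Q-value iteration on a toy one-dimensional problem. Everything concrete here is an assumption made for illustration: the model f, the reward rho, the triangular membership functions, the discount factor, and the discretized action set are not taken from the paper.

```python
import numpy as np

# Hypothetical 1-D setup; all names and values are illustrative,
# not the paper's benchmark problem.
gamma = 0.9                                  # discount factor (assumed)
centers = np.linspace(-1.0, 1.0, 11)         # membership-function centers
actions = np.array([-0.1, 0.0, 0.1])         # discretized action set

def f(x, u):
    """Deterministic process model: next state, saturated to [-1, 1]."""
    return np.clip(x + u, -1.0, 1.0)

def rho(x, u):
    """Reward function: penalize squared distance from the origin."""
    return -x**2

def memberships(x):
    """Normalized triangular membership degrees of state x (sum to 1)."""
    width = centers[1] - centers[0]
    mu = np.maximum(0.0, 1.0 - np.abs(x - centers) / width)
    return mu / mu.sum()

# Fuzzy Q-iteration: theta[l, j] approximates Q(centers[l], actions[j]).
theta = np.zeros((len(centers), len(actions)))
for it in range(200):
    new_theta = np.empty_like(theta)
    for l, x in enumerate(centers):
        for j, u in enumerate(actions):
            mu_next = memberships(f(x, u))
            # Q-values of the next state: membership-weighted interpolation.
            q_next = mu_next @ theta            # one value per action
            new_theta[l, j] = rho(x, u) + gamma * q_next.max()
    if np.abs(new_theta - theta).max() < 1e-8:  # stop near the fixed point
        theta = new_theta
        break
    theta = new_theta

def policy(x):
    """Greedy action from the fuzzy Q-approximation."""
    return actions[np.argmax(memberships(x) @ theta)]
```

Under this reading, the serial version mentioned in the abstract would correspond to updating theta in place during the sweep over (l, j) pairs, so that each update immediately uses already-updated parameters; such a Gauss-Seidel-style sweep cannot converge more slowly than the synchronous one above.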