Department Seminar

Exact methods for solving Markov Decision Processes (MDPs) are ineffective in practice, either because the number of states is large or because the model of the MDP is not known. Approximate Dynamic Programming (ADP) algorithms are approximate solution methods for MDPs with a large number of states. Reinforcement Learning (RL) algorithms are sample-trajectory-based solution methods for MDPs. The talk focuses on conditions that guarantee the performance of ADP algorithms and the stability of RL algorithms.