Reducing Sample Complexity in Reinforcement Learning by Transferring Transition and Reward Probabilities
Proceedings of the 6th International Conference on Agents and Artificial Intelligence (ICAART 2014),
Vol. 1, pp. 632-638
(2014), [peer-reviewed]
Event Date: March 6-8, 2014
Abstract
Most existing reinforcement learning algorithms require many trials until they obtain optimal policies. In this study, we apply transfer learning to reinforcement learning to realize greater efficiency. We propose a new algorithm called TR-MAX, based on the R-MAX algorithm. TR-MAX transfers the transition and reward probabilities from a source task to a target task as prior knowledge. We theoretically analyze the sample complexity of TR-MAX. Moreover, we show that TR-MAX performs much better in practice than R-MAX in maze tasks.
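The abstract does not spell out how TR-MAX uses the transferred probabilities, so the following is only a rough sketch of the general idea, not the paper's algorithm. It shows a tabular R-MAX-style model in which state-action pairs visited fewer than m times fall back to a model transferred from a source task, where plain R-MAX would use an optimistic default. The class name TransferRMaxModel, the parameters prior_T and prior_R, and the self-loop form of the optimistic default are all assumptions made for illustration.

```python
import numpy as np


class TransferRMaxModel:
    """Hypothetical tabular R-MAX-style model with an optional transferred prior.

    Illustration only: the fallback rule below is an assumption, not the
    TR-MAX update rule from the paper.
    """

    def __init__(self, n_states, n_actions, r_max, m, prior_T=None, prior_R=None):
        self.n_states = n_states
        self.r_max = r_max          # upper bound on the one-step reward
        self.m = m                  # visits needed before a pair counts as "known"
        self.counts = np.zeros((n_states, n_actions, n_states))
        self.reward_sum = np.zeros((n_states, n_actions))
        self.prior_T = prior_T      # assumed shape: (n_states, n_actions, n_states)
        self.prior_R = prior_R      # assumed shape: (n_states, n_actions)

    def update(self, s, a, r, s_next):
        """Record one observed transition (s, a, r, s_next)."""
        self.counts[s, a, s_next] += 1
        self.reward_sum[s, a] += r

    def is_known(self, s, a):
        return self.counts[s, a].sum() >= self.m

    def model(self, s, a):
        """Return (transition distribution, expected reward) used for planning."""
        n = self.counts[s, a].sum()
        if n >= self.m:
            # Known pair: use empirical estimates, as in R-MAX.
            return self.counts[s, a] / n, self.reward_sum[s, a] / n
        if self.prior_T is not None and self.prior_R is not None:
            # Unknown pair with a transferred model: use the source-task values.
            return self.prior_T[s, a], self.prior_R[s, a]
        # Unknown pair, no prior: optimistic self-loop with reward r_max
        # (a common simplification of R-MAX's fictitious maximal-reward state).
        t = np.zeros(self.n_states)
        t[s] = 1.0
        return t, self.r_max
```

Without a prior, unknown pairs look maximally rewarding, which drives exploration. With a prior, the planner can act on the source-task model from the first step, which is where any saving in samples would come from. This naive substitution gives up R-MAX's optimism for those pairs, so it can only be expected to help when the source and target tasks are similar.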