TY - JOUR
T1 - Transferring knowledge as heuristics in reinforcement learning
T2 - A case-based approach
AU - Bianchi, Reinaldo A.C.
AU - Celiberto, Luiz A.
AU - Santos, Paulo E.
AU - Matsuura, Jackson P.
AU - Lopez De Mantaras, Ramon
PY - 2015/9
Y1 - 2015/9
N2 - The goal of this paper is to propose and analyse a transfer learning meta-algorithm that allows the implementation of distinct methods that use heuristics obtained from one (simpler) domain (the source) to accelerate a Reinforcement Learning procedure in another domain (the target). This meta-algorithm works in three stages: first, it uses a Reinforcement Learning step to learn a task on the source domain, storing the knowledge thus obtained in a case base; second, it performs an unsupervised mapping of the source-domain actions to the target-domain actions; and, third, the case base obtained in the first stage is used as heuristics to speed up the learning process in the target domain. A set of empirical evaluations was conducted in two target domains: the 3D mountain car (using a case base learned from a 2D simulation) and stability learning for a humanoid robot in the RoboCup 3D Soccer Simulator (using knowledge learned from the Acrobot domain). The results attest that our transfer learning algorithm outperforms recent heuristically-accelerated reinforcement learning and transfer learning algorithms.
AB - The goal of this paper is to propose and analyse a transfer learning meta-algorithm that allows the implementation of distinct methods that use heuristics obtained from one (simpler) domain (the source) to accelerate a Reinforcement Learning procedure in another domain (the target). This meta-algorithm works in three stages: first, it uses a Reinforcement Learning step to learn a task on the source domain, storing the knowledge thus obtained in a case base; second, it performs an unsupervised mapping of the source-domain actions to the target-domain actions; and, third, the case base obtained in the first stage is used as heuristics to speed up the learning process in the target domain. A set of empirical evaluations was conducted in two target domains: the 3D mountain car (using a case base learned from a 2D simulation) and stability learning for a humanoid robot in the RoboCup 3D Soccer Simulator (using knowledge learned from the Acrobot domain). The results attest that our transfer learning algorithm outperforms recent heuristically-accelerated reinforcement learning and transfer learning algorithms.
KW - Case-based reasoning
KW - Reinforcement learning
KW - Transfer learning
UR - http://www.scopus.com/inward/record.url?scp=84930960233&partnerID=8YFLogxK
U2 - 10.1016/j.artint.2015.05.008
DO - 10.1016/j.artint.2015.05.008
M3 - Article
AN - SCOPUS:84930960233
SN - 0004-3702
VL - 226
SP - 102
EP - 121
JO - Artificial Intelligence
JF - Artificial Intelligence
ER -