TY - GEN
T1 - A method for the online construction of the set of states of a Markov decision process using answer set programming
AU - Ferreira, Leonardo Anjoletto
AU - Bianchi, Reinaldo A.C.
AU - Santos, Paulo E.
AU - de Mantaras, Ramon Lopez
PY - 2018/5/30
Y1 - 2018/5/30
AB - Non-stationary domains, which change in unpredicted ways, are a challenge for agents searching for optimal policies in sequential decision-making problems. This paper presents a combination of Markov Decision Processes (MDP) with Answer Set Programming (ASP), named Online ASP for MDP (oASP(MDP)), a method capable of constructing the set of domain states while the agent interacts with a changing environment. oASP(MDP) updates previously obtained policies, learnt by means of Reinforcement Learning (RL), using rules that represent the domain changes observed by the agent. These rules represent a set of domain constraints that are processed as ASP programs, reducing the search space. Results show that oASP(MDP) is capable of finding solutions for problems in non-stationary domains without interfering with the action-value function approximation process.
UR - http://www.scopus.com/inward/record.url?scp=85049012571&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-92058-0_1
DO - 10.1007/978-3-319-92058-0_1
M3 - Conference contribution
AN - SCOPUS:85049012571
SN - 9783319920573
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 3
EP - 15
BT - Recent Trends and Future Technology in Applied Intelligence - 31st International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2018, Proceedings
A2 - Ait Mohamed, Otmane
A2 - Mouhoub, Malek
A2 - Sadaoui, Samira
A2 - Ali, Moonis
PB - Springer-Verlag
T2 - 31st International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems IEA/AIE 2018
Y2 - 25 June 2018 through 28 June 2018
ER -