A method for the online construction of the set of states of a Markov decision process using answer set programming

Leonardo Anjoletto Ferreira, Reinaldo A.C. Bianchi, Paulo E. Santos, Ramon Lopez de Mantaras

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Non-stationary domains, which change in unpredicted ways, are a challenge for agents searching for optimal policies in sequential decision-making problems. This paper presents a combination of Markov Decision Processes (MDP) with Answer Set Programming (ASP), named Online ASP for MDP (oASP(MDP)), a method capable of constructing the set of domain states while the agent interacts with a changing environment. oASP(MDP) updates previously obtained policies, learnt by means of Reinforcement Learning (RL), using rules that represent the domain changes observed by the agent. These rules represent a set of domain constraints that are processed as ASP programs, reducing the search space. Results show that oASP(MDP) is capable of finding solutions for problems in non-stationary domains without interfering with the action-value function approximation process.
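The idea in the abstract, building the state set while the agent interacts and restricting actions with observed constraints, can be illustrated with a minimal sketch. This is not the authors' implementation: the class `OnlineStateQLearner` and its methods are hypothetical names, the learner is plain tabular Q-learning, and a Python set of forbidden (state, action) pairs stands in for the ASP program that oASP(MDP) would hand to an answer set solver.

```python
import random


class OnlineStateQLearner:
    """Tabular Q-learning whose state set grows as states are observed.

    Assumption: a set of forbidden (state, action) pairs replaces the ASP
    constraints described in the abstract; no ASP solver is involved here.
    """

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = {}             # state -> {action: value}; starts empty
        self.forbidden = set()  # (state, action) pairs ruled out by observation
        self.rng = random.Random(seed)

    def observe_state(self, state):
        # Add a state to the known set the first time it is seen.
        if state not in self.q:
            self.q[state] = {a: 0.0 for a in self.actions}

    def add_constraint(self, state, action):
        # Record an observed domain change; learned Q-values are left untouched.
        self.forbidden.add((state, action))

    def allowed(self, state):
        self.observe_state(state)
        return [a for a in self.actions if (state, a) not in self.forbidden]

    def choose(self, state):
        # Epsilon-greedy choice over the actions still allowed in this state.
        options = self.allowed(state)
        if not options:
            return None
        if self.rng.random() < self.epsilon:
            return self.rng.choice(options)
        return max(options, key=lambda a: self.q[state][a])

    def update(self, s, a, r, s_next):
        # Standard Q-learning update, bootstrapping only from allowed actions.
        self.observe_state(s)
        self.observe_state(s_next)
        best_next = max((self.q[s_next][b] for b in self.allowed(s_next)),
                        default=0.0)
        self.q[s][a] += self.alpha * (r + self.gamma * best_next - self.q[s][a])
```

In this sketch, calling `add_constraint` after the environment changes narrows the choices offered by `choose` without resetting the action-value table, which loosely mirrors the abstract's claim that the constraints reduce the search space without interfering with value approximation.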

Original language: English
Title of host publication: Recent Trends and Future Technology in Applied Intelligence - 31st International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2018, Proceedings
Editors: Otmane Ait Mohamed, Malek Mouhoub, Samira Sadaoui, Moonis Ali
Publisher: Springer-Verlag
Pages: 3-15
Number of pages: 13
ISBN (Print): 9783319920573
Publication status: Published - 30 May 2018
Externally published: Yes
Event: 31st International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems IEA/AIE 2018 - Montreal, Canada
Duration: 25 Jun 2018 → 28 Jun 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10868 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 31st International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems IEA/AIE 2018
Country: Canada
City: Montreal
Period: 25/06/18 → 28/06/18
