Solving Safety Problems with Ensemble Reinforcement Learning

Leonardo A. Ferreira, Thiago F. dos Santos, Reinaldo A.C. Bianchi, Paulo E. Santos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

An agent that learns by interacting with an environment may find unexpected solutions to decision-making problems. Such a solution can be an improvement over well-known ones, such as a new strategy for a game, but in some cases the unexpected solution is unwanted and should be avoided for reasons such as safety. This paper proposes a Reinforcement Learning Ensemble Framework called ReLeEF. The framework combines decision-making methods to provide finer-grained control of the agent’s behaviour while still letting it learn by interacting with the environment. It was tested in the safety gridworlds, and the results show that it can find optimal solutions while satisfying the safety concerns described for each domain, something that state-of-the-art Deep Reinforcement Learning methods were unable to do.

Original language: English
Title of host publication: AI 2019
Subtitle of host publication: Advances in Artificial Intelligence - 32nd Australasian Joint Conference, 2019, Proceedings
Editors: Jixue Liu, James Bailey
Place of Publication: Cham, Switzerland
Publisher: Springer
Pages: 203-214
Number of pages: 12
ISBN (Print): 9783030352875
Publication status: Published - 2019
Event: 32nd Australasian Joint Conference on Artificial Intelligence, AI 2019 - Adelaide, Australia
Duration: 2 Dec 2019 to 5 Dec 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11919 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 32nd Australasian Joint Conference on Artificial Intelligence, AI 2019
Country/Territory: Australia
City: Adelaide
Period: 2/12/19 to 5/12/19

Keywords

  • Ontology
  • Reinforcement Learning
  • Safety
