Lifelong Incremental Reinforcement Learning With Online Bayesian Inference

Overview

Journal IEEE Trans Neural Netw Learn Syst

Specialties Biomedical Engineering
Medical Informatics

Date 2021 Feb 11

PMID 33571098

Citations 2

Authors

Zhi Wang

Chunlin Chen

Daoyi Dong

Abstract

A central capability of a long-lived reinforcement learning (RL) agent is to incrementally adapt its behavior as its environment changes and to incrementally build upon previous experiences to facilitate future learning in real-world scenarios. In this article, we propose lifelong incremental reinforcement learning (LLIRL), a new incremental algorithm for efficient lifelong adaptation to dynamic environments. We develop and maintain a library that contains an infinite mixture of parameterized environment models, which is equivalent to clustering environment parameters in a latent space. The prior distribution over the mixture is formulated as a Chinese restaurant process (CRP), which incrementally instantiates new environment models without any external information to signal environmental changes in advance. During lifelong learning, we employ the expectation-maximization (EM) algorithm with online Bayesian inference to update the mixture in a fully incremental manner. In EM, the E-step involves estimating the posterior expectation of environment-to-cluster assignments, whereas the M-step updates the environment parameters for future learning. This method allows for all environment models to be adapted as necessary, with new models instantiated for environmental changes and old models retrieved when previously seen environments are encountered again. Simulation experiments demonstrate that LLIRL outperforms relevant existing methods and enables effective incremental adaptation to various dynamic environments for lifelong learning.
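
The CRP prior and the E-step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `crp_prior` and `e_step`, the concentration parameter `alpha`, and the use of plain per-cluster counts and precomputed log-likelihoods are all assumptions made for illustration. The M-step (updating the environment model parameters) is omitted.

```python
import numpy as np


def crp_prior(counts, alpha):
    """CRP prior over the cluster assignment of the next environment.

    counts[k] is the number of past environments assigned to cluster k
    (hypothetical bookkeeping; assumed positive for existing clusters).
    Returns a vector of length K + 1 whose last entry is the prior
    probability of instantiating a new cluster.
    """
    counts = np.asarray(counts, dtype=float)
    total = counts.sum() + alpha
    return np.append(counts, alpha) / total


def e_step(log_likelihoods, counts, alpha):
    """Posterior over assignments to the K existing clusters plus one new one.

    log_likelihoods has length K + 1: the log-likelihood of the newly
    observed data under each existing model and under a fresh model.
    """
    log_post = np.log(crp_prior(counts, alpha)) + np.asarray(
        log_likelihoods, dtype=float
    )
    log_post -= log_post.max()  # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()
```

With `counts = [3, 1]` and `alpha = 1.0`, the prior places probability 0.6 and 0.2 on the two existing clusters and 0.2 on a new one. If the new data are poorly explained by both existing models, the posterior from `e_step` shifts toward the new cluster, which is how a new environment model would be instantiated without an external change signal.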

Citing Articles

Incremental Learning of Goal-Directed Actions in a Dynamic Environment by a Robot Using Active Inference.

Matsumoto T, Ohata W, Tani J. Entropy (Basel). 2023; 25(11).

PMID: 37998198 PMC: 10670890. DOI: 10.3390/e25111506.

An in silico framework for modeling optimal control of neural systems.

Rueckauer B, van Gerven M. Front Neurosci. 2023; 17:1141884.

PMID: 36968496 PMC: 10030734. DOI: 10.3389/fnins.2023.1141884.