skip to main content
article
Free access

Mitigating Value Hallucination in Dyna-Style Planning via Multistep Predecessor Models

Published: 09 June 2024 Publication History

Abstract

Dyna-style reinforcement learning (RL) agents improve sample efficiency over model-free RL agents by updating the value function with simulated experience generated by an environment model. However, it is often difficult to learn accurate models of environment dynamics, and even small errors may result in failure of Dyna agents. In this paper, we highlight that one potential cause of that failure is bootstrapping off of the values of simulated states, and introduce a new Dyna algorithm to avoid this failure. We discuss a design space of Dyna algorithms, based on using successor or predecessor models---simulating forwards or backwards---and using one-step or multi-step updates. Three of the variants have been explored, but surprisingly the fourth variant has not: using predecessor models with multi-step updates. We present the \emph{Hallucinated Value Hypothesis} (HVH): updating the values of real states towards values of simulated states can result in misleading action values which adversely affect the control policy. We discuss and evaluate all four variants of Dyna amongst which three update real states toward simulated states --- so potentially toward hallucinated values --- and our proposed approach, which does not. The experimental results provide evidence for the HVH, and suggest that using predecessor models with multi-step updates is a fruitful direction toward developing Dyna algorithms that are more robust to model error.

Index Terms

  1. Mitigating Value Hallucination in Dyna-Style Planning via Multistep Predecessor Models
      Index terms have been assigned to the content through auto-classification.

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image Journal of Artificial Intelligence Research
      Journal of Artificial Intelligence Research  Volume 80, Issue
      Sep 2024
      1696 pages

      Publisher

      AI Access Foundation

      El Segundo, CA, United States

      Publication History

      Published: 09 June 2024
      Published in JAIR Volume 80

      Qualifiers

      • Article

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • 0
        Total Citations
      • 60
        Total Downloads
      • Downloads (Last 12 months)60
      • Downloads (Last 6 weeks)24
      Reflects downloads up to 24 Oct 2024

      Other Metrics

      Citations

      View Options

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Get Access

      Login options

      Full Access

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media