BEINGS: Bayesian Embodied Image-goal Navigation with Gaussian Splatting

The Hong Kong University of Science and Technology
ICRA 2025


A case study of long-distance ImageNav using BEINGS. We replace the real visual measurements with images rendered from the current robot pose in the 3DGS map. BEINGS successfully guides the robot to explore the environment, navigate around obstacles, and finally reach the target.

Abstract

Image-goal navigation (ImageNav) enables a robot to reach the location where a target image was captured, using visual cues for guidance. However, current methods either rely on data-hungry, computationally expensive learning-based approaches or lack efficiency in complex environments due to insufficient exploration strategies. To address these limitations, we propose Bayesian Embodied Image-goal Navigation with Gaussian Splatting (BEINGS), a novel method that formulates ImageNav as an optimal control problem within a model predictive control (MPC) framework. BEINGS leverages 3D Gaussian Splatting as a scene prior to predict future observations, enabling efficient, real-time navigation decisions grounded in the robot's sensory experiences. By integrating Bayesian updates, our method dynamically refines the robot's strategy without requiring extensive prior experience or data. Our algorithm is validated through extensive simulations and physical experiments, showcasing its potential for embodied robot systems in visually complex scenarios.
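The core MPC idea in the abstract can be sketched as follows: sample candidate action sequences, roll the robot pose forward, predict the observation at each future pose by rendering from the map, and score the predictions against the goal image. This is a minimal illustrative sketch, not the BEINGS implementation: `render_feature` is a hypothetical stand-in for rasterizing the 3DGS map and encoding the image, the rollout is a toy kinematic model, and the Bayesian belief update described in the abstract is omitted.

```python
import numpy as np

def render_feature(pose):
    """Hypothetical stand-in for rendering an image from the 3DGS map at
    `pose` and embedding it. A real system would rasterize the Gaussian
    splats and run an image encoder."""
    x, y, yaw = pose
    return np.array([np.sin(x), np.cos(y), np.sin(yaw), np.cos(x + y)])

def similarity(a, b):
    """Cosine similarity between the goal embedding and a prediction."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def mpc_step(pose, goal_feat, action_samples, horizon=3):
    """One MPC decision: roll each sampled action sequence forward,
    score the observations predicted from the map against the goal
    image, and return the first action of the best-scoring sequence."""
    best_score, best_first = -np.inf, None
    for seq in action_samples:
        p, score = np.asarray(pose, dtype=float), 0.0
        for a in seq[:horizon]:
            p = p + np.asarray(a)          # toy kinematic rollout
            score += similarity(render_feature(p), goal_feat)
        if score > best_score:
            best_score, best_first = score, seq[0]
    return best_first, best_score
```

In a receding-horizon loop, only the returned first action is executed, a new real observation is taken, and the step is repeated from the updated pose.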

Experiments