RPG360: Robust 360 Depth Estimation with Perspective Foundation Models and Graph Optimization

Dongki Jung, Jaehoon Choi, Yonghan Lee, Dinesh Manocha
University of Maryland, College Park

NeurIPS 2025


RPG360 leverages the prior knowledge of a perspective foundation model along with graph optimization, thereby enhancing 3D structural awareness and achieving superior performance.

Abstract

The increasing use of 360° images across various domains has emphasized the need for robust depth estimation techniques tailored for omnidirectional images. However, obtaining large-scale labeled datasets for 360° depth estimation remains a significant challenge. In this paper, we propose RPG360, a training-free 360° monocular depth estimation method that leverages perspective foundation models. Our approach converts 360° images into six-face cubemap representations, where a perspective foundation model is employed to estimate depth and surface normals. To address depth scale inconsistencies across different faces of the cubemap, we introduce a novel depth scale alignment technique using graph-based optimization, which parameterizes the predicted depth and normal maps while incorporating an additional per-face scale parameter. This optimization ensures depth scale consistency across the six-face cubemap while preserving 3D structural integrity. Furthermore, as foundation models exhibit inherent robustness in zero-shot settings, our method achieves superior performance across diverse datasets, including Matterport3D, Stanford2D3D, and 360Loc. We also demonstrate the versatility of our depth estimation approach by validating its benefits in downstream tasks such as feature matching (3.2 - 5.4% improvement) and Structure from Motion (0.2 - 9.7% improvement in AUC@5).
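As a concrete illustration of the cubemap step described above, the sketch below shows one common way to sample six 90°-FoV perspective faces from an equirectangular image. The face orientation conventions and nearest-neighbour sampling are our own assumptions for illustration, not the released implementation.

```python
# Minimal sketch (not the authors' code): split an equirectangular 360° image
# into six perspective cubemap faces so a perspective foundation model can be
# run on each face independently.
import numpy as np

def equirect_to_cube_faces(erp: np.ndarray, face_size: int = 512) -> dict:
    """Sample six 90°-FoV cubemap faces from an equirectangular image (H x W x C)."""
    H, W = erp.shape[:2]
    # Unit-focal pixel grid for a 90° field of view: x, y in [-1, 1].
    u = np.linspace(-1, 1, face_size)
    xx, yy = np.meshgrid(u, u)
    # Assumed ray directions (x right, y down, z forward) for each cube face.
    rays = {
        "front": np.stack([xx, yy, np.ones_like(xx)], -1),
        "back":  np.stack([-xx, yy, -np.ones_like(xx)], -1),
        "left":  np.stack([-np.ones_like(xx), yy, xx], -1),
        "right": np.stack([np.ones_like(xx), yy, -xx], -1),
        "up":    np.stack([xx, -np.ones_like(xx), yy], -1),
        "down":  np.stack([xx, np.ones_like(xx), -yy], -1),
    }
    faces = {}
    for name, d in rays.items():
        d = d / np.linalg.norm(d, axis=-1, keepdims=True)
        lon = np.arctan2(d[..., 0], d[..., 2])          # [-pi, pi]
        lat = np.arcsin(np.clip(d[..., 1], -1, 1))      # [-pi/2, pi/2]
        # Nearest-neighbour lookup into the equirectangular grid.
        px = ((lon / (2 * np.pi) + 0.5) * (W - 1)).round().astype(int)
        py = ((lat / np.pi + 0.5) * (H - 1)).round().astype(int)
        faces[name] = erp[py, px]
    return faces
```

Each extracted face can then be passed independently to the perspective foundation model to obtain per-face depth and surface normal predictions.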

Process Visualization


Visualization of RPG360 on the Matterport3D dataset.


Visualization of RPG360 on the 360Loc dataset.

Proposed Method
We introduce a novel depth scale alignment technique using graph-based optimization, which parameterizes the predicted depth and normal maps while incorporating an additional per-face scale parameter.
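To make the per-face scale idea concrete, the sketch below solves a simplified version of the alignment: each cube face receives a log-scale variable, and a linear least-squares problem enforces that scaled depths agree along shared face boundaries. This is only an illustrative reduction, not the paper's full graph optimization, which additionally parameterizes the predicted depth and normal maps; `edge_obs` is a hypothetical precomputed input.

```python
# Minimal sketch of per-face scale alignment via least squares (assumed, simplified).
import numpy as np

def align_face_scales(edge_obs: list) -> np.ndarray:
    """edge_obs: list of (i, j, log_di, log_dj), where i, j are adjacent face
    indices in [0, 6) and log_di, log_dj are mean log-depths of each face
    sampled along their shared cube edge."""
    rows, rhs = [], []
    for i, j, log_di, log_dj in edge_obs:
        r = np.zeros(6)
        r[i], r[j] = 1.0, -1.0            # t_i - t_j should cancel the depth gap
        rows.append(r)
        rhs.append(log_dj - log_di)
    # Gauge constraint: fix the overall (unknown) global scale via sum(t) = 0.
    rows.append(np.ones(6))
    rhs.append(0.0)
    A, b = np.stack(rows), np.array(rhs)
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.exp(t)                       # per-face multiplicative scales
```

The returned scales can be multiplied onto each face's predicted depth map before merging the six faces back into a single omnidirectional depth map.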

Results


Qualitative comparisons. Our method demonstrates superior structural accuracy and robustness compared to existing approaches.

We demonstrate the versatility of RPG360 by applying it to omnidirectional vision downstream tasks.

Point cloud generated by RPG360
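For reference, a point cloud like the one shown above can be obtained by back-projecting an aligned equirectangular depth map along each pixel's spherical viewing ray. The sketch below uses our own ray convention and is not the authors' released code.

```python
# Minimal sketch (assumed): equirectangular depth map -> 3D point cloud.
import numpy as np

def erp_depth_to_points(depth: np.ndarray) -> np.ndarray:
    """depth: H x W equirectangular depth map -> (H*W) x 3 points."""
    H, W = depth.shape
    lon = (np.arange(W) / (W - 1) - 0.5) * 2 * np.pi    # [-pi, pi]
    lat = (np.arange(H) / (H - 1) - 0.5) * np.pi        # [-pi/2, pi/2]
    lon, lat = np.meshgrid(lon, lat)
    # Unit ray directions on the sphere (x right, y down, z forward).
    dirs = np.stack([np.cos(lat) * np.sin(lon),
                     np.sin(lat),
                     np.cos(lat) * np.cos(lon)], axis=-1)
    return (dirs * depth[..., None]).reshape(-1, 3)
```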