ViPE: Video Pose Engine for 3D Geometric Perception
Contributions:
• A robust and efficient framework, ViPE, for estimating camera parameters and dense depth from diverse, in-the-wild videos.
• A system design that integrates the strengths of classical SLAM (efficiency, scalability) and learned models (robustness), with key improvements in efficiency, dynamic object handling, and depth quality over prior work.
• A large-scale dataset of annotated videos, created using ViPE, to facilitate future research in 3D computer vision.