rsasaki0109 (@rsasaki0109)
2025-12-28 | ❤️ 138 | 🔁 21
DVGT: Driving Visual Geometry Transformer

DVGT, a universal visual geometry transformer for autonomous driving, directly predicts metric-scaled global 3D point maps from a sequence of unposed multi-view images, eliminating the need for post-alignment with external data.

https://github.com/wzzheng/DVGT

DVGT proposes a universal framework for driving geometry perception. Unlike conventional driving models that are tightly coupled to specific sensor setups or require ground-truth poses, the model uses spatial-temporal attention to process unposed image sequences directly. By decoding global geometry in the ego-coordinate system, DVGT achieves metric-scaled dense reconstruction without LiDAR alignment and adapts to diverse vehicles and camera configurations.
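For context on what "post-alignment with external data" typically involves: a scale-ambiguous reconstruction is often rescaled against sparse LiDAR depths, for example with median scaling. The sketch below illustrates that generic step only. It is not taken from the DVGT repository, and the function names and sample values are hypothetical. Per the post, DVGT predicts metric scale directly, so a step like this would not be needed.

```python
from statistics import median


def median_scale(pred_depths, lidar_depths):
    """Return the scalar s that maps up-to-scale depths onto LiDAR depths.

    Uses the median of per-point ratios (lidar / pred), skipping
    non-positive values. Hypothetical helper for illustration only.
    """
    ratios = [l / p for p, l in zip(pred_depths, lidar_depths) if p > 0 and l > 0]
    if not ratios:
        raise ValueError("no valid depth pairs to align")
    return median(ratios)


def align_to_metric(pred_depths, lidar_depths):
    """Rescale predicted depths so they are expressed in LiDAR (metric) units."""
    s = median_scale(pred_depths, lidar_depths)
    return [s * d for d in pred_depths]


# Hypothetical values: the prediction is only correct up to scale,
# while the LiDAR depths at the same pixels are in metres.
pred = [1.0, 2.0, 4.0, 8.0]
lidar = [2.5, 5.0, 10.0, 20.0]

scale = median_scale(pred, lidar)
metric = align_to_metric(pred, lidar)
print(scale, metric)
```

The median is used rather than the mean so that a few bad LiDAR matches do not skew the recovered scale.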

Related
- chain-of-view-makes-vision-language-models-move-through-a
- nano-banana-pro-can-generate-360-degree-visuals-so-i-wanted
- how-to-setup-a-multi-agent-system-bookmark-it-the-trading
- mvinverse-feed-forward-multi-view-inverse-rendering-in
- watch-the-awesome-4dgs-plugin-running-in-lichtfeld-studio