📚 Sehyeon's Vault

UniVLG: Unifying 2D and 3D Vision-Language Understanding

April 23, 2025 · 1 min read

3D-Vision
VLM

Ayush Jain (@ayushjain1144)

2025-04-23 | ❤️ 136 | 🔁 28

1/ Despite having access to rich 3D inputs, embodied agents still rely on 2D VLMs—due to the lack of large-scale 3D data and pre-trained 3D encoders.

We introduce UniVLG, a unified 2D-3D VLM that leverages 2D scale to improve 3D scene understanding. https://univlg.github.io/ https://x.com/ayushjain1144/status/1915087087486828854/video/1

🔗 Original links

https://univlg.github.io/
https://x.com/ayushjain1144/status/1915087087486828854/video/1

Media

video

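Summary

Embodied agents have had to rely on 2D VLMs because large-scale 3D data and pre-trained 3D encoders are scarce. To address this limitation, the authors propose UniVLG, a VLM that unifies 2D and 3D. The key idea is to use the scale of 2D data to improve 3D scene understanding.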

🔗 Related

Auto-generated - needs manual review

Tags

domain-vision-3d domain-robotics domain-vlm domain-dev-tools domain-visionos


Created with Quartz v4.5.2 © 2026

Sehyeon Park