📚 Sehyeon's Vault

mllms are great at understanding videos but struggle with sp

November 7, 2025 · 1 min read

3D-Vision
rendering

Ellis Brown (@_ellisbrown)

2025-11-07 | ❤️ 237 | 🔁 47

MLLMs are great at understanding videos, but struggle with spatial reasoning—like estimating distances or tracking objects across time.

the bottleneck? getting precise 3D spatial annotations on real videos is expensive and error-prone.

introducing SIMS-V 🤖

[1/n] https://x.com/_ellisbrown/status/1986904352506667479/video/1

🔗 Original link

https://x.com/_ellisbrown/status/1986904352506667479/video/1

Created with Quartz v4.5.2 © 2026

Sehyeon Park