DynamicVLA
A compact 0.4B-parameter Vision-Language-Action model that lets robots manipulate moving objects in real time, closing the perception-execution gap with Continuous Inference and Latent-aware Action Streaming. https://x.com/HuggingPapers/status/2017094507402318169/video/1
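The post doesn't explain the mechanisms, but the general pattern behind streaming actions to close a perception-execution gap is to overlap policy inference with execution so the controller never idles between action chunks. Below is a minimal, hypothetical Python sketch of that generic pattern, not DynamicVLA's actual implementation; `infer_chunk`, the loop structure, and all timings and queue sizes are stand-ins:

```python
import threading
import queue
import time

def infer_chunk(obs):
    """Stand-in for a VLA forward pass: returns a short chunk of actions."""
    time.sleep(0.05)  # simulated inference latency
    return [f"action(obs={obs}, step={i})" for i in range(4)]

def inference_loop(action_queue, get_obs, stop):
    # Keep running inference on fresh observations and stream the
    # resulting action chunks, instead of blocking execution on each pass.
    while not stop.is_set():
        chunk = infer_chunk(get_obs())
        for a in chunk:
            try:
                action_queue.put(a, timeout=0.1)
            except queue.Full:
                break  # executor fell behind; re-plan from a fresh observation

def control_loop(action_queue, stop, steps=20):
    # Execute actions at a fixed control rate, drawing from the stream.
    for _ in range(steps):
        a = action_queue.get()
        print("execute", a)
        time.sleep(0.01)  # simulated control tick
    stop.set()

obs_counter = [0]
def get_obs():
    obs_counter[0] += 1
    return obs_counter[0]

q = queue.Queue(maxsize=8)  # bounded so stale actions don't pile up
stop = threading.Event()
producer = threading.Thread(target=inference_loop, args=(q, get_obs, stop))
producer.start()
control_loop(q, stop)
producer.join()
```

The bounded queue is the key design choice in this sketch: when the scene changes faster than inference, the producer drops its remaining chunk and re-plans from a newer observation rather than feeding the executor stale actions.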
Original link
Media