Jie Wang (@JieWang_ZJUI)
2026-01-13 | ❤️ 355 | 🔁 51
Today's VLAs enable robots to perform impressive manipulation tasks like folding clothes, making coffee, and cleaning dishes. Surprisingly, though, most VLAs lack memory. Unlike their close relatives, LLMs, they have no context window and no access to history. As a result, they repeatedly fail in the same way without learning from online experience.
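To make the "no context window" point concrete, here is a minimal Python sketch. It is my own illustration, not any specific VLA's API: a memoryless policy conditions on the current observation only, while a history-conditioned variant keeps a rolling window of past (observation, action) pairs, analogous to an LLM's token context. The class names, the `context_length` parameter, and `_model_forward` are all hypothetical.

```python
import numpy as np


class MemorylessPolicy:
    """Sketch of a memoryless policy: the action depends only on the current
    observation, so a failure at step t leaves no trace that could change
    behavior at step t+1."""

    def act(self, observation: np.ndarray) -> np.ndarray:
        return self._model_forward(observation)

    def _model_forward(self, observation: np.ndarray) -> np.ndarray:
        # Placeholder for a model call; returns a dummy 7-DoF action.
        return np.zeros(7)


class HistoryConditionedPolicy:
    """Sketch of a history-conditioned policy: a rolling context window of
    past (observation, action) pairs is fed back into the model, analogous
    to an LLM's token context."""

    def __init__(self, context_length: int = 16):
        self.context_length = context_length
        self.history: list[tuple[np.ndarray, np.ndarray]] = []

    def act(self, observation: np.ndarray) -> np.ndarray:
        context = self.history[-self.context_length:]
        action = self._model_forward(observation, context)
        self.history.append((observation, action))
        return action

    def _model_forward(self, observation, context) -> np.ndarray:
        # Placeholder for a model call; returns a dummy 7-DoF action.
        return np.zeros(7)
```

The catch, as the next point explains, is that the second interface is far harder to realize for embodied agents than it is for text-only LLMs.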
But why? Why not simply extend the context window, as LLMs do? It's not that we don't want to; it's that it is extremely difficult. Here I share a talk by @chelseabfinn at NeurIPS that scopes out the challenges of developing long-horizon autonomy for embodied agents. At the end, there's a reading list on memory for robotics. ⭐
