Chelsea Finn (@chelseabfinn)

2026-01-24 | ❤️ 506 | 🔁 48 | 💬 8


Video models serve as a good pretrained backbone for robot policies.

Paper: https://arxiv.org/abs/2601.16163 Code: https://github.com/nvlabs/cosmos-policy

🔗 Original link

์ธ์šฉ๋œ ํŠธ์œ—

@moo_jin_kim: We release Cosmos Policy ๐Ÿ’ซ: a state-of-the-art robot policy built on a video diffusion model backbone.

  • policy + world model + value function โ€” in 1 model
  • no architectural changes to the base videโ€ฆ

Quoted tweet

Moo Jin Kim (@moo_jin_kim)

We release Cosmos Policy 💫: a state-of-the-art robot policy built on a video diffusion model backbone.

  • policy + world model + value function - in 1 model
  • no architectural changes to the base video model
  • SOTA in LIBERO (98.5%), RoboCasa (67.1%), & ALOHA tasks (93.6%)

🧵👇 https://t.co/cz9L3ziJ6x

Original tweet

🎬 Video

Tags

Robotics GenAI AI-ML Dev-Tools