Xiao Fu (@lemonaddie0909)

2026-01-12 | ❤️ 369 | 🔁 54

Video generation, but 4D, dynamic, scene-consistent, and very long at the same time?!

Introducing 𝐏𝐥𝐞𝐧𝐨𝐩𝐭𝐢𝐜𝐃𝐫𝐞𝐚𝐦𝐞𝐫, 𝐦𝐮𝐥𝐭𝐢-𝐯𝐢𝐞𝐰 𝐯𝐢𝐝𝐞𝐨 𝐠𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐨𝐧 𝐰𝐢𝐭𝐡 𝐥𝐨𝐧𝐠-𝐭𝐞𝐫𝐦 𝐬𝐩𝐚𝐭𝐢𝐨-𝐭𝐞𝐦𝐩𝐨𝐫𝐚𝐥 𝐦𝐞𝐦𝐨𝐫𝐲! The scaling secret is very simple: an autoregressive paradigm with minimal 3D inductive bias, aided by a spatially grounded memory retrieval mechanism.