机器之心 JIQIZHIXIN (@jiqizhixin)
2026-01-25 | ❤️ 299 | 🔁 44 | 💬 6
What if we could model vision like a wave moving through space?
Researchers from Peking & Tsinghua Universities present WaveFormer.
They treat image features as signals governed by a wave equation, explicitly controlling how low-to-high frequency details evolve across network layers.
This new Wave Propagation Operator outperforms standard Vision Transformers in image classification, detection, and segmentation, achieving up to 1.6x higher throughput with 30% fewer computations.
WaveFormer: Frequency-Time Decoupled Vision Modeling with Wave Equation
Paper: https://arxiv.org/abs/2601.08602
Code: https://github.com/ZishanShu/WaveFormer
Our report: https://mp.weixin.qq.com/s/xFoj94IIG4xjucJvew8ilQ
PapersAccepted by Jiqizhixin
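
To make the "features as signals governed by a wave equation" idea concrete, here is a minimal NumPy sketch of the textbook closed-form solution of the undamped wave equation u_tt = c² ∇²u on a periodic 2D grid, evaluated in the Fourier domain. It is not WaveFormer's Wave Propagation Operator (see the paper and repository for that). The function name `wave_propagate`, the initial velocity `v0`, the speed `c`, and the time `t` are all illustrative assumptions. The point it shows is that each spatial frequency evolves independently, oscillating at a rate proportional to its wavenumber.

```python
import numpy as np

def wave_propagate(u0, v0, t, c=1.0):
    """Hypothetical helper: solve u_tt = c^2 * laplacian(u) on a periodic
    2D grid (unit spacing) at time t, given the initial field u0 and the
    initial velocity v0. This is the classical spectral solution, used
    here only as an illustration of frequency-wise wave propagation."""
    h, w = u0.shape
    # Angular wavenumbers along each axis.
    ky = 2 * np.pi * np.fft.fftfreq(h)
    kx = 2 * np.pi * np.fft.fftfreq(w)
    # Dispersion relation of the wave equation: omega = c * |k|.
    omega = c * np.sqrt(ky[:, None] ** 2 + kx[None, :] ** 2)

    u0_hat = np.fft.fft2(u0)
    v0_hat = np.fft.fft2(v0)

    # u_hat(t) = cos(omega t) * u0_hat + sin(omega t) / omega * v0_hat.
    # np.sinc(x) = sin(pi x) / (pi x), so t * sinc(omega t / pi) equals
    # sin(omega t) / omega and stays finite at omega = 0.
    u_hat = (np.cos(omega * t) * u0_hat
             + t * np.sinc(omega * t / np.pi) * v0_hat)
    return np.fft.ifft2(u_hat).real

# Usage: a single horizontal cosine mode with zero initial velocity.
x = np.arange(16)
u0 = np.tile(np.cos(2 * np.pi * 2 * x / 16), (8, 1))
v0 = np.zeros_like(u0)
u_half_period = wave_propagate(u0, v0, t=4.0)
```

In this usage example the mode has wavenumber π/4, so after t = 4 it has advanced by half a period. Low-frequency content changes slowly and high-frequency content changes quickly, which is one way to read "controlling how low-to-high frequency details evolve".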
Original links
- https://arxiv.org/abs/2601.08602
- https://github.com/ZishanShu/WaveFormer
- https://mp.weixin.qq.com/s/xFoj94IIG4xjucJvew8ilQ

Related
- video-models-serve-as-a-good-pretrained-backbone-for-robot (topics: AI-ML, Dev-Tools)
- introducing-shaper-a-method-for-robust-conditional-3d-shape (topics: AI-ML, Dev-Tools)
- what-if-we-could-train-ai-robots-in-a-perfect-physics (topics: AI-ML, Dev-Tools)
- you-could-direct-a-video-like-a-real-3d-world-researchers (topics: AI-ML, Dev-Tools)
- releasing-vitok-v2-open-source-vit-auto-encoder-codebase (topics: AI-ML, Dev-Tools)