What if we could model vision like a wave moving through space?
Researchers from Peking & Tsinghua Universities present WaveFormer.
They treat image features as signals governed by a wave equation, explicitly controlling how low-to-high frequency details evolve across network layers.
This new Wave Propagation Operator outperforms standard Vision Transformers in image classification, detection, and segmentation, achieving up to 1.6x higher throughput with 30% fewer computations.
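To make the wave-equation idea concrete, here is a minimal NumPy sketch of how a 2D feature map evolves under the plain, undamped wave equation u_tt = c^2 ∇²u, solved in the Fourier domain. This is not the authors' Wave Propagation Operator: the function name `wave_propagate`, the periodic boundary assumption, and the parameters `c` and `t` are illustrative assumptions, and the actual operator may add damping or learned parameters. The point it shows is that each spatial frequency oscillates at its own rate, so the propagation time controls how low- and high-frequency content mix.

```python
import numpy as np

def wave_propagate(u0, v0, c=1.0, t=1.0):
    """Evolve a 2D map under u_tt = c^2 * laplacian(u) for time t.

    u0: initial field (H, W); v0: initial velocity (H, W).
    Hypothetical helper; periodic boundaries are assumed so that the
    FFT diagonalizes the Laplacian.
    """
    h, w = u0.shape
    # Angular spatial frequency of each FFT bin.
    ky = 2.0 * np.pi * np.fft.fftfreq(h)[:, None]
    kx = 2.0 * np.pi * np.fft.fftfreq(w)[None, :]
    omega = c * np.sqrt(kx**2 + ky**2)  # oscillation rate per frequency

    u_hat = np.fft.fft2(u0)
    v_hat = np.fft.fft2(v0)

    # sin(omega * t) / omega tends to t as omega -> 0 (the DC bin).
    safe_omega = np.where(omega == 0, 1.0, omega)
    sin_term = np.where(omega == 0, t, np.sin(omega * t) / safe_omega)

    # Closed-form solution of the wave equation for each frequency.
    out_hat = u_hat * np.cos(omega * t) + v_hat * sin_term
    return np.fft.ifft2(out_hat).real
```

In this sketch a low-frequency component changes slowly with `t` while a high-frequency one oscillates quickly, which is one way to read "controlling how low-to-high frequency details evolve": `t` plays the role of a depth-like knob.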
WaveFormer: Frequency-Time Decoupled Vision Modeling with Wave Equation
Paper: https://arxiv.org/abs/2601.08602
Code: https://github.com/ZishanShu/WaveFormer
Our report: https://mp.weixin.qq.com/s/xFoj94IIG4xjucJvew8ilQ
🔬 PapersAccepted by Jiqizhixin
🔗 Original links
- https://arxiv.org/abs/2601.08602
- https://github.com/ZishanShu/WaveFormer
- https://mp.weixin.qq.com/s/xFoj94IIG4xjucJvew8ilQ
