Shubham Tulsiani (@shubhtuls)
2026-02-05 | ❤️ 529 | 🔁 66 | 💬 3
[1/N] Rotary Position Embeddings (RoPE) are ubiquitous across transformers that process tokens from 1D, 2D, or 3D grids e.g. language, images, or videos. Our RayRoPE formulation extends these to multi-view transformers. Paper and code: https://rayrope.github.io/ https://x.com/shubhtuls/status/2019470333594468357/video/1
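For context, here is a minimal sketch of the standard 1D RoPE the post refers to, not the RayRoPE formulation, which is described in the linked paper. The names `rope_1d` and `dot` and the example vectors are illustrative assumptions; `base=10000.0` is the value commonly used for standard RoPE. The sketch rotates consecutive feature pairs by position-dependent angles and checks that the attention score depends only on the relative offset between positions.

```python
import math

def rope_1d(vec, pos, base=10000.0):
    """Rotate each consecutive pair of `vec` by an angle proportional to `pos`."""
    d = len(vec)
    assert d % 2 == 0, "feature dimension must be even"
    out = []
    for i in range(0, d, 2):
        # Pair i/2 uses frequency base^(-i/d), i.e. base^(-2*(i/2)/d).
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical query/key features (illustrative values only).
q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

# Both position pairs share the same offset (3), so the scores should match.
score_a = dot(rope_1d(q, 5), rope_1d(k, 2))
score_b = dot(rope_1d(q, 13), rope_1d(k, 10))
```

Because each pair is rotated by an angle linear in position, the dot product between a rotated query and a rotated key depends only on the difference of their positions. This is the property that makes RoPE a relative positional encoding on grids.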
Original content
RayRoPE: Projective Ray Positional Encoding for Multi-view Attention
Media
Video
Related
- 1n-rotary-position-embeddings-rope-are-ubiquitous-across-transformers-that - Topics: Reconstruction, Transformer, Rope
- vggt-slam-20-real-time-dense-feed-forward-scene-reconstructi - Domain: Vision/3D
- introducing-360anything-our-method-for-lifting-any-perspecti - Domain: Vision/3D
- eag-pt-emission-aware-gaussians-and-path-tracing-for-indoorโ281012 - Topics: Reconstruction, Rope
- iggt-instance-grounded-geometry-transformer - Topics: Reconstruction, Transformer