
RayRoPE: Projective Ray Positional Encoding for Multi-view Attention

February 5, 2026

  • 3D-Vision
  • multi-view-transformer

Shubham Tulsiani (@shubhtuls)

2026-02-05 | ❤️ 730 | 🔁 91


[1/N] Rotary Position Embeddings (RoPE) are ubiquitous across transformers that process tokens from 1D, 2D, or 3D grids e.g. language, images, or videos. Our RayRoPE formulation extends these to multi-view transformers. Paper and code: https://t.co/abVobLRJxq https://t.co/cYhczUqrGc
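For context, here is a minimal sketch of the standard 1D RoPE that the post refers to. It is not the RayRoPE formulation, whose details are not given here. The function names `rope` and `dot`, the example vectors, and the frequency base of 10000 are illustrative assumptions. The sketch rotates consecutive feature pairs by an angle proportional to the token position, so the query-key dot product depends only on the relative offset between positions.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate consecutive pairs of `vec` by angles proportional to `pos`."""
    d = len(vec)
    assert d % 2 == 0, "RoPE needs an even feature dimension"
    out = []
    for i in range(0, d, 2):
        # Per-pair frequency base^(-i/d), where i is the even feature index.
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2D rotation of the (x, y) pair by angle theta.
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical query and key features, chosen only for illustration.
q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

# Same relative offset (3) at two different absolute positions.
score_a = dot(rope(q, 5), rope(k, 2))
score_b = dot(rope(q, 105), rope(k, 102))
print(abs(score_a - score_b) < 1e-9)
```

The two scores agree up to floating-point error because each pair is rotated by a pure rotation, and the product of one rotation with the inverse of another depends only on the difference of their angles.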


