Anand Bhattad (@anand_bhattad)

2025-03-29 | ❤️ 383 | 🔁 81

[1/8] Is scene understanding solved?

We can label pixels and detect objects with high accuracy. But does that mean we truly understand scenes?

Super excited to share our new paper and a new task in computer vision: Visual Jenga!

🔗 Original link

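The paper proposes the "Visual Jenga" task: removing objects from an input image one at a time while keeping the scene coherent. As in the example image, where stacked bowls disappear one after another, it uses counterfactual inpainting to evaluate genuine scene understanding rather than object recognition accuracy alone.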
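
To make the one-at-a-time removal idea concrete, here is a minimal sketch. It is not the paper's method: it assumes a hand-written support map (`supports`, a hypothetical input saying which objects rest on which) and only computes an order in which objects could be removed without pulling out something that still holds another object up. The inpainting step itself is left out.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def removal_order(supports):
    """Return an order in which objects can be removed one at a time.

    `supports` is a hypothetical, hand-written map from each object to the
    set of objects resting directly on it. An object becomes removable only
    after everything resting on it has already been removed.
    """
    # TopologicalSorter takes node -> predecessors, and static_order() yields
    # predecessors first, so objects on top come out before their supports.
    return list(TopologicalSorter(supports).static_order())


# Toy scene modeled on the stacked-bowls example (names are illustrative).
stack = {
    "bottom_bowl": {"middle_bowl"},
    "middle_bowl": {"top_bowl"},
    "top_bowl": set(),
}
order = removal_order(stack)
print(order)
```

For the three-bowl stack this removes the top bowl first and the bottom bowl last. In the actual task the dependencies are not given and have to be inferred from the image, which is what makes it a test of scene understanding.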