Domain: VLM (16)
grounding (6)
- this-is-a-phenomenal-video-by-jbhuang0604-explaining-seminal → this-is-a-phenomenal-video-by-jbhuang0604-explaining-seminal
- can-vlms-really-think-in-4d-3d-space-time-when-a-model-cant → can-vlms-really-think-in-4d-3d-space-time-when-a-model-cant
- thinking-with-camera-just-dropped-on-hugging-face-a-unified → thinking-with-camera-just-dropped-on-hugging-face-a-unified
- thrilled-to-announce-our-icml25-paper-why-is-spatial → thrilled-to-announce-our-icml25-paper-why-is-spatial
- florence2-sam2-is-officially-my-most-liked-space → florence2-sam2-is-officially-my-most-liked-space
- fastembed-030-is-here-now-featuring-image-embeddings-resnet5 → fastembed-030-is-here-now-featuring-image-embeddings-resnet5
VQA (6)
- oh-boy-a-2b-vision-model-seriously-damn → oh-boy-a-2b-vision-model-seriously-damn
- excited-to-share-our-new-work-streamingvlm-we-tackle-a → excited-to-share-our-new-work-streamingvlm-we-tackle-a
- streamingvlm-real-time-understanding-for-infinite-video → streamingvlm-real-time-understanding-for-infinite-video
- finally-our-report-of-incentivizing-reasoning-in-vlms-is → finally-our-report-of-incentivizing-reasoning-in-vlms-is
- really-great-use-of-multimodal-llms-to-analyze-a-massive → really-great-use-of-multimodal-llms-to-analyze-a-massive
- 𝐌𝐢𝐧𝐢𝐆𝐏𝐓𝟒-𝐕𝐢𝐝𝐞𝐨-gradio-demo-is-now-available-on-the-spaces-a → 𝐌𝐢𝐧𝐢𝐆𝐏𝐓𝟒-𝐕𝐢𝐝𝐞𝐨-gradio-demo-is-now-available-on-the-spaces-a
segmentation (2)
- harvard-just-dropped-a-book-on-ml-systems-engineering-and → harvard-just-dropped-a-book-on-ml-systems-engineering-and
- i-spent-3-hours-today-trying-to-reverse-engineer-the-data-format-required-to-fin → i-spent-3-hours-today-trying-to-reverse-engineer-the-data-format-required-to-fin
web-graphics (1)
- experiment-with-a-multimodal-llm-in-webar-the-ai-identifies → experiment-with-a-multimodal-llm-in-webar-the-ai-identifies
visionos (1)
- announcing-mistralai-ocr-the-worlds-best-document-understanding-api → announcing-mistralai-ocr-the-worlds-best-document-understanding-api