Philippe Hansen-Estruch (@tokenpilled65B)

2026-01-20 | โค๏ธ 413 | ๐Ÿ” 58


Releasing ViTok-v2: open-source ViT auto-encoder codebase + pretrained weights

Train your own ViT auto-encoder on any streamed (hf://) or local webdataset. NaFlex pipeline handles any resolution and aspect ratio

Includes reproduced 350M and 4.5B models weights competitive at 256p, SOTA at high-res (512p+)

๋ฏธ๋””์–ด

image


Tags

LLM RAG code-gen