Article Vision Language Models (Better, Faster, Stronger) By merve and 4 others • 5 days ago • 291
Article Gotchas in Tokenizer Behavior Every Developer Should Know By qgallouedec • 28 days ago • 35
Hf-native ColVision Models Collection Models that can be used with the native transformers 🤗 implementation instead of colpali-engine. • 3 items • Updated about 1 month ago • 3
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 255
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7 • 181
Article ViDoRe Benchmark V2: Raising the Bar for Visual Retrieval By manu and 2 others • Mar 18 • 9