InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published 7 days ago • 232
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The model preserves similar quality as half precision while using 3x less memory. • 19 items • Updated 3 days ago • 18
MAI-DS-R1 Collection MAI-DS-R1 is a DeepSeek-R1 reasoning model that has been post-trained by the Microsoft AI team. • 2 items • Updated 4 days ago • 7
BitNet Collection 🔥BitNet family of large language models (1-bit LLMs). • 6 items • Updated 4 days ago • 28
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published 10 days ago • 119
Apriel Collection ServiceNow Language Modeling Lab's first model family series • 2 items • Updated 7 days ago • 6
Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 6 items • Updated 9 days ago • 61
Granite 3.3 Language Models Collection Our latest language models licensed under Apache 2.0 license. • 4 items • Updated 5 days ago • 29
Orpheus Multilingual Research Release Collection Beta Release of multilingual models. • 12 items • Updated 11 days ago • 76
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published 14 days ago • 164