ReTool: Reinforcement Learning for Strategic Tool Use in LLMs Paper • 2504.11536 • Published 5 days ago • 53
Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models Paper • 2504.03624 • Published 16 days ago • 13
OpenCodeReasoning: Advancing Data Distillation for Competitive Coding Paper • 2504.01943 • Published 18 days ago • 13
OpenCodeReasoning Collection Reasoning data for supervised finetuning of LLMs to advance data distillation for competitive coding • 5 items • Updated 6 days ago • 7
Orpheus Multilingual Research Release Collection Beta Release of multilingual models. • 12 items • Updated 10 days ago • 76
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published 13 days ago • 163
Llama 4 Collection Meta's new Llama 4 multimodal models, Scout & Maverick. Includes Dynamic GGUFs, 16-bit & Dynamic 4-bit uploads. Run & fine-tune them with Unsloth! • 15 items • Updated 4 days ago • 43
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning Paper • 2503.19470 • Published 27 days ago • 17
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The model preserves similar quality as half precision while using 3x less memory • 15 items • Updated 3 days ago • 147