MiniCPM4 Collection MiniCPM4: Ultra-Efficient LLMs on End Devices • 17 items • Updated about 9 hours ago • 45
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published 18 days ago • 86
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published 18 days ago • 77
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published 27 days ago • 64
Article Good answers are not necessarily factual answers: an analysis of hallucination in leading LLMs By davidberenstein1957 and 1 other • May 7 • 35
Article Mixture of Tunable Experts - Behavior Modification of DeepSeek-R1 at Inference Time By rbrt and 4 others • Feb 18 • 33
Pleias-RAG Collection New generation of small reasoning models for RAG, search, and source summarization. • 4 items • Updated Apr 24 • 27
Article Finetuning olmOCR to be a faithful OCR-Engine By tngtech and 1 other • Apr 22 • 18
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory. • 15 items • Updated 11 days ago • 197
Llama 4 Collection Meta's new Llama 4 multimodal models, Scout & Maverick. Includes Dynamic GGUFs, 16-bit & Dynamic 4-bit uploads. Run & fine-tune them with Unsloth! • 15 items • Updated 11 days ago • 46
🌙 March 2025 - Open releases from the Chinese community Collection • 32 items • Updated 25 days ago • 13
Article Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques By jmamou and 8 others • Mar 24 • 18
Gemma 3 QAT INT4 (from Flax) Collection These are converted from the official QAT INT4 Flax checkpoints on Kaggle. Supported formats: AutoAWQ, GGUF • 12 items • Updated Apr 6 • 5
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published Mar 14 • 108