Byte Latent Transformer: Patches Scale Better Than Tokens • Paper • arXiv:2412.09871 • Published Dec 13, 2024 • 101 upvotes
RealHarm: A Collection of Real-World Language Model Application Failures • Paper • arXiv:2504.10277 • Published Apr 14, 2025 • 11 upvotes
Gemma 3 QAT • Collection • Quantization-Aware Trained (QAT) Gemma 3 checkpoints that preserve quality similar to half precision while using about 3x less memory (see the sketch after this list) • 15 items • Updated 5 days ago • 158 upvotes
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens • Paper • arXiv:2503.01710 • Published Mar 3, 2025 • 5 upvotes
More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG • Paper • arXiv:2503.04388 • Published Mar 6, 2025 • 16 upvotes
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking • Paper • arXiv:2501.04519 • Published Jan 8, 2025 • 277 upvotes
MiniMax-01: Scaling Foundation Models with Lightning Attention • Paper • arXiv:2501.08313 • Published Jan 14, 2025 • 286 upvotes
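The roughly 3x memory saving claimed for the Gemma 3 QAT collection follows from storing weights in 4 bits instead of 16. The back-of-the-envelope sketch below illustrates this; it assumes about 0.5 bit per parameter of quantization overhead (scale factors), and the parameter counts are illustrative round numbers, not official Gemma 3 figures.

```python
# Rough memory comparison: bf16 weights vs. a 4-bit QAT checkpoint.
# Assumptions: 16 bits/param for half precision; 4 bits/param for int4
# weights plus ~0.5 bit/param of overhead for quantization scales.

def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Memory needed for the weights alone, in gigabytes."""
    return num_params * bits_per_param / 8 / 1e9

for name, params in [("4B-class model", 4e9), ("27B-class model", 27e9)]:
    bf16 = weight_memory_gb(params, 16.0)   # half precision: 2 bytes/param
    q4 = weight_memory_gb(params, 4.5)      # int4 + scale overhead
    print(f"{name}: bf16 ~{bf16:.1f} GB, Q4 QAT ~{q4:.1f} GB "
          f"(~{bf16 / q4:.1f}x smaller)")
```

The ratio works out to roughly 3.5x on weights alone; activations and KV cache add back some memory at inference time, which is consistent with the collection's "3x less memory" figure.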