Hogwild! Inference: Parallel LLM Generation via Concurrent Attention • Paper • 2504.06261 • Published 17 days ago • 104
VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks • Paper • 2504.05118 • Published 18 days ago • 25
Qwen2.5-Coder • Collection • Code-specific model series based on Qwen2.5 • 40 items • Updated Nov 28, 2024 • 308
BASS: Batched Attention-optimized Speculative Sampling • Paper • 2404.15778 • Published Apr 24, 2024 • 10