The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks Paper • 2504.15521 • Published 2 days ago • 53
Light-R1 Collection Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond • 7 items • Updated Mar 13 • 11
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 277