Sibyl: Simple yet Effective Agent Framework for Complex Real-world Reasoning Paper • 2407.10718 • Published Jul 15, 2024 • 19
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published 11 days ago • 116
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published 25 days ago • 64
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published Apr 29 • 94
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18 • 127
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback Paper • 2503.22230 • Published Mar 28 • 44
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning Paper • 2503.05592 • Published Mar 7 • 27
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published Feb 3 • 216
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step Paper • 2501.13926 • Published Jan 23 • 42
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 401
GameFactory: Creating New Games with Generative Interactive Videos Paper • 2501.08325 • Published Jan 14 • 66
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 293
Exploring the Inquiry-Diagnosis Relationship with Advanced Patient Simulators Paper • 2501.09484 • Published Jan 16 • 19
LLaMA-Omni: Seamless Speech Interaction with Large Language Models Paper • 2409.06666 • Published Sep 10, 2024 • 58
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale Paper • 2409.08264 • Published Sep 12, 2024 • 49