Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States Paper • 2505.17663 • Published 15 days ago • 14
LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling Paper • 2505.19187 • Published 13 days ago • 12
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale Paper • 2409.17115 • Published Sep 25, 2024 • 63
Running on CPU Upgrade 13.1k 13.1k Open LLM Leaderboard 🏆 Track, rank and evaluate open LLMs and chatbots
CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark Paper • 2401.11944 • Published Jan 22, 2024 • 28
CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark Paper • 2401.11944 • Published Jan 22, 2024 • 28