Surrogate Signals from Format and Length: Reinforcement Learning for Solving Mathematical Problems without Ground Truth Answers • Paper • 2505.19439 • Published 6 days ago • 30
Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models • Paper • 2505.18536 • Published 8 days ago • 18
MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific Hypothesis Discovery via Hierarchical Search • Paper • 2505.19209 • Published 6 days ago • 24
Alchemist: Turning Public Text-to-Image Data into Generative Gold • Paper • 2505.19297 • Published 6 days ago • 72
Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps • Paper • 2505.18675 • Published 7 days ago • 23
Deciphering Trajectory-Aided LLM Reasoning: An Optimization Perspective • Paper • 2505.19815 • Published 5 days ago • 36
B-score: Detecting biases in large language models using response history • Paper • 2505.18545 • Published 8 days ago • 29
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles • Paper • 2505.19914 • Published 5 days ago • 39
Mutarjim: Advancing Bidirectional Arabic-English Translation with a Small Language Model • Paper • 2505.17894 • Published 8 days ago • 205
Embodied Agents Meet Personalization: Exploring Memory Utilization for Personalized Assistance • Paper • 2505.16348 • Published 10 days ago • 43
BizFinBench: A Business-Driven Real-World Financial Benchmark for Evaluating LLMs • Paper • 2505.19457 • Published 6 days ago • 60
Shifting AI Efficiency From Model-Centric to Data-Centric Compression • Paper • 2505.19147 • Published 6 days ago • 139