DiaTool-DPO: Multi-Turn Direct Preference Optimization for Tool-Augmented Large Language Models Paper • 2504.02882 • Published Apr 2 • 7