CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published 9 days ago • 87
Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarsity Paper • 2502.11901 • Published Feb 17 • 6
Only-IF:Revealing the Decisive Effect of Instruction Diversity on Generalization Paper • 2410.04717 • Published Oct 7, 2024 • 18