DataDecide: How to Predict Best Pretraining Data with Small Experiments Paper • 2504.11393 • Published 8 days ago
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens Paper • 2504.07096 • Published 14 days ago
EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees Paper • 2503.08893 • Published Mar 11
Establishing Task Scaling Laws via Compute-Efficient Model Ladders Paper • 2412.04403 • Published Dec 5, 2024
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens Paper • 2401.17377 • Published Jan 30, 2024