The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters • Running • 2.66k
100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models Paper • 2505.00551 • Published May 1 • 37
ModernGLiNER Collection GLiNER models based on modern encoder architectures • 2 items • Updated Dec 24, 2024 • 7