Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated Feb 26 • 118
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published Dec 25, 2024 • 101
Deliberation in Latent Space via Differentiable Cache Augmentation Paper • 2412.17747 • Published Dec 23, 2024 • 33
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published Dec 9, 2024 • 83
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated 3 days ago • 155
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming Paper • 2408.16725 • Published Aug 29, 2024 • 54
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search Paper • 2408.08152 • Published Aug 15, 2024 • 58
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated 23 days ago • 146
Probably function calling datasets Collection Created using the https://huggingface.co/spaces/librarian-bots/dataset-column-search-api Space. • 39 items • Updated Jul 17, 2024 • 37
view article Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24, 2024 • 193
GenQA: Generating Millions of Instructions from a Handful of Prompts Paper • 2406.10323 • Published Jun 14, 2024 • 5
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback Paper • 2406.00888 • Published Jun 2, 2024 • 34
sentence-transformers-from-synthetic-data Collection Example of using distilabel to generate synthetic triplets data for fine-tuning a Sentence Transformer model • 4 items • Updated Jun 21, 2024 • 22