DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 116
Consent by Design: Approaches to User Data in Open AI Ecosystems By giadap and 1 other • 6 days ago • 9
Optimise AI Models and Make Them Faster, Smaller, Cheaper, Greener By PrunaAI and 2 others • 19 days ago • 18
Framepack : groundbreaking video generation technology whose model size is only 0.13B parameters. By LLMhacker • 4 days ago • 6
Fine-tuning SmolLM with Group Relative Policy Optimization (GRPO) by following the Methodologies By prithivMLmods • Feb 17 • 19
What is The Agent2Agent Protocol (A2A) and Why You Must Learn It Now By lynn-mikami • 11 days ago • 7
Metric and Relative Monocular Depth Estimation: An Overview. Fine-Tuning Depth Anything V2 👐 📚 By Isayoften • Jul 10, 2024 • 61