Lyte committed on
Commit 784084a · verified · 1 Parent(s): 1f26120

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -23,7 +23,7 @@ language:
 
  ## 🎮 Overview
 
- QuadConnect2.5-0.5B is a specialized language model trained to master the game of Connect Four. Built on Qwen 2.5 (0.5B parameter base), this model uses GRPO (Gradient-based Reward Policy Optimization) to learn the strategic intricacies of Connect Four gameplay.
+ QuadConnect2.5-0.5B is a specialized language model trained to master the game of Connect Four. Built on Qwen 2.5 (0.5B parameter base), this model uses GRPO (Group Relative Policy Optimization) to learn the strategic intricacies of Connect Four gameplay.
 
  **Status**: Early training experiments (v0.0.9b) - Reward functions still evolving
 