xiao23451 committed
Commit 5d3772d · verified · 1 Parent(s): 16aa77b

Update README.md

Files changed (1)
  1. README.md +6 -7
README.md CHANGED
@@ -1,9 +1,3 @@
- GPG: A Simple and Strong Reinforcement Learning
- Baseline for Model Reasoning
- https://arxiv.org/abs/2504.02546
-
- The RL model trained on the Open-r1 dataset based on GPG, using DeepSeek-R1-Distill-Qwen-1.5B as the baseline model.
-
  ---
  license: apache-2.0
  datasets:
@@ -14,4 +8,9 @@ metrics:
  - accuracy
  base_model:
  - deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
- ---
+ ---
+ GPG: A Simple and Strong Reinforcement Learning
+ Baseline for Model Reasoning
+ https://arxiv.org/abs/2504.02546
+
+ The RL model trained on the Open-r1 dataset based on GPG, using DeepSeek-R1-Distill-Qwen-1.5B as the baseline model.
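
The commit moves the `---`-delimited YAML block from the bottom of the description to the top of README.md. A likely motivation, stated here as an assumption rather than something the commit message confirms, is that model card metadata is only recognized when the YAML block opens the file. The sketch below illustrates that rule on shortened versions of the old and new README; `split_front_matter` is a hypothetical helper written for this example, not a function from any library.

```python
def split_front_matter(text):
    """Return (metadata_lines, body) for a README-style document.

    Hypothetical helper: the metadata block only counts if the very first
    line is '---' and a later line closes it with '---'.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return [], text  # no block at the top, so no metadata is found
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return lines[1:i], "\n".join(lines[i + 1:])
    return [], text  # opening delimiter was never closed


# Shortened stand-ins for the README before and after the commit.
old_readme = """GPG: A Simple and Strong Reinforcement Learning
Baseline for Model Reasoning

---
license: apache-2.0
---
"""

new_readme = """---
license: apache-2.0
---
GPG: A Simple and Strong Reinforcement Learning
Baseline for Model Reasoning
"""

old_meta, _ = split_front_matter(old_readme)
new_meta, new_body = split_front_matter(new_readme)
print(old_meta)  # metadata found in the old layout
print(new_meta)  # metadata found in the new layout
```

Under this assumption, the old layout yields no metadata because the file starts with prose, while the new layout exposes the `license` entry and leaves the description as the body.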