The collection for the paper "SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild"
HKUST NLP Group
university
Collections: 7
models: 45
hkust-nlp/Llama-3.1-8B-SimpleRL-Zoo • 48
hkust-nlp/Qwen-2.5-32B-SimpleRL-Zoo • 201
hkust-nlp/Qwen-2.5-7B-SimpleRL-Zoo • 900
hkust-nlp/DeepSeek-Math-7B-SimpleRL-Zoo • 74
hkust-nlp/Mistral-7B-v0.1-SimpleRL-Zoo • 35
hkust-nlp/Qwen-2.5-1.5B-SimpleRL-Zoo • 1.41k
hkust-nlp/Qwen-2.5-0.5B-SimpleRL-Zoo • 57
hkust-nlp/Qwen-2.5-14B-SimpleRL-Zoo • 191
hkust-nlp/Mistral-Small-24B-SimpleRL-Zoo • 41
hkust-nlp/Qwen-2.5-Math-7B-SimpleRL-Zoo • 3.49k
datasets: 23
hkust-nlp/GUIMid • Viewer • 1.7M • 128 • 2
hkust-nlp/SimpleRL-Zoo-Data • Viewer • 53.1k • 1.23k • 4
hkust-nlp/PreSelect-100B • Viewer • 54.5M • 983 • 9
hkust-nlp/CodeIO-PyEdu-Reasoning • Preview • 159 • 49
hkust-nlp/CodeIO-PyEdu-Reasoning-Raw • 46
hkust-nlp/SynCSE-partial-NLI • Viewer • 263k • 30 • 2
hkust-nlp/SynCSE-scratch-NLI • Viewer • 276k • 23 • 2
hkust-nlp/gsm8k-fix • Viewer • 7.47k • 86 • 2
hkust-nlp/dart-math-uniform • Viewer • 591k • 65 • 9
hkust-nlp/vrt-baseline • Viewer • 591k • 40 • 1