Intermediate stuff for tool using
RLAIF
Enterprise
community
Collections 1
models 9
RLAIF/Qwen3b-GRPO • 13
RLAIF/llama-3b-open-r1-50k-sft • 225
RLAIF/sft-external • Text Generation
RLAIF/sft-llama-3.1-8b-external • Text Generation
RLAIF/sft-gemma-2-9b-base-sft-llama-405b-instruct-correct-only-format-lr-5e-06-bs-64 • Text Generation
RLAIF/sft-llama8b-prm-800k-correct-only • Text Generation
RLAIF/22-sequential-temp-0-verifier-no-best-oracle-in-context-train-8
RLAIF/22-sequential-temp-0-verifier-oracle-in-context-train-8-w-error-masking
RLAIF/15-w-error-masking-temp-0-verifier-in-context-train-in-context-inference-8-model
datasets 26
RLAIF/mbpp • 1.4k • 39
RLAIF/STAR-TRAIN-math_llama-star-iter5 • 3.31k • 36
RLAIF/STAR-TRAIN-math_lama-star-iter4 • 3.27k • 35
RLAIF/STAR-TRAIN-math_llama-star-iter3 • 3.2k • 35
RLAIF/STAR-TRAIN-math_llama-star-iter2 • 3.15k • 27
RLAIF/STAR-TRAIN-math_llama-star-iter1 • 2.93k • 33
RLAIF/math • 12.5k • 274 • 1
RLAIF/iGSM-1M-retry0.5 • 1.01M • 33
RLAIF/iGSM-1M-retry0.0 • 1.01M • 23
RLAIF/iGSM-1M-retry0.6 • 1.01M • 33