metadata
license: apache-2.0
Qwen2.5-7B-Huatuo-difficulty-SFT
Base Model: Qwen/Qwen2.5-7B
Training Epochs: 3
Training Objective: SFT + RL
Training Data:
- SFT Data: ReasoningEval/Huatuo-SFT-difficulty
- RL Data: ReasoningEval/Huatuo-RL