license: apache-2.0
### Qwen2.5-7B-Huatuo-difficulty-SFT
- Base Model: [Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B)
- Training Epochs: 3
- Training Objective: SFT + RL
- Training Data:
  - SFT Data: [ReasoningEval/Huatuo-SFT-difficulty](https://huggingface.co/datasets/ReasoningEval/Huatuo-SFT-difficulty)
  - RL Data: [ReasoningEval/Huatuo-RL](https://huggingface.co/datasets/ReasoningEval/Huatuo-RL)
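
The resources listed above can be fetched from the Hugging Face Hub. The following is a minimal sketch, not the authors' training code: the `ModelCard` class, `CARD`, `hub_url`, `load_training_data`, and `load_base_model` are hypothetical names introduced for illustration, and the sketch assumes that both dataset repos are accessible and load with their default configuration. Only the repo IDs and the epoch count come from this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCard:
    """Values copied from this card; the class itself is a hypothetical helper."""
    base_model: str = "Qwen/Qwen2.5-7B"
    epochs: int = 3
    sft_dataset: str = "ReasoningEval/Huatuo-SFT-difficulty"
    rl_dataset: str = "ReasoningEval/Huatuo-RL"


CARD = ModelCard()


def hub_url(repo_id: str, is_dataset: bool = False) -> str:
    """Build the Hugging Face Hub page URL for a model or dataset repo."""
    prefix = "datasets/" if is_dataset else ""
    return f"https://huggingface.co/{prefix}{repo_id}"


def load_training_data(card: ModelCard = CARD):
    """Download the SFT and RL datasets named on the card.

    Requires the `datasets` package and network access.
    Assumption: each repo loads with its default configuration.
    """
    from datasets import load_dataset  # imported lazily so the module loads without it

    return load_dataset(card.sft_dataset), load_dataset(card.rl_dataset)


def load_base_model(card: ModelCard = CARD):
    """Download the base model and tokenizer (not the fine-tuned checkpoint).

    Requires the `transformers` package, network access, and enough memory
    for a 7B-parameter model.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(card.base_model)
    model = AutoModelForCausalLM.from_pretrained(card.base_model)
    return tokenizer, model
```

Calling `load_training_data()` returns the SFT and RL datasets in that order, and `load_base_model()` returns the tokenizer and base model. Neither function runs on import, so the snippet can be inspected without triggering a download.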