Phi4-ThinkMode

This is a fine-tuned version of unsloth/Phi-4 with enhanced reasoning capabilities using GRPO (1000 step) on the dataset gsm8k

Model details

  • Base model: unsloth/Phi-4
  • Fine-tuning: 16-bit precision
  • Use case: Improved reasoning and thinking mode
Downloads last month
9
Safetensors
Model size
14.7B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for ykarout/Phi4-ThinkMode-fp16

Base model

microsoft/phi-4
Finetuned
unsloth/phi-4
Finetuned
(78)
this model
Quantizations
1 model

Dataset used to train ykarout/Phi4-ThinkMode-fp16