Llama-3.1-8B-Magpie-Align-v0.2 / train_results.json
Zhangchen Xu
{
  "epoch": 0.9992652461425422,
  "total_flos": 0.0,
  "train_loss": 0.41019999388775796,
  "train_runtime": 41758.0505,
  "train_samples": 97988,
  "train_samples_per_second": 2.347,
  "train_steps_per_second": 0.018
}
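The throughput fields can be cross-checked against the raw counts. The following is a minimal sketch, not part of the repository: it assumes `train_runtime` is measured in seconds and that `train_samples_per_second` is roughly `train_samples / train_runtime` for a run of about one epoch. The helper name `derived_throughput` is hypothetical.

```python
import json

# Values copied verbatim from train_results.json above.
TRAIN_RESULTS = """
{
  "epoch": 0.9992652461425422,
  "total_flos": 0.0,
  "train_loss": 0.41019999388775796,
  "train_runtime": 41758.0505,
  "train_samples": 97988,
  "train_samples_per_second": 2.347,
  "train_steps_per_second": 0.018
}
"""


def derived_throughput(results: dict) -> dict:
    """Recompute throughput figures from the raw fields.

    Assumption: train_runtime is in seconds. Hypothetical helper,
    not part of any library.
    """
    runtime = results["train_runtime"]
    return {
        # Samples processed per second over the whole run.
        "samples_per_second": results["train_samples"] / runtime,
        # Wall-clock duration in hours.
        "runtime_hours": runtime / 3600,
        # Rough step count; the reported rate is rounded to 3 decimals,
        # so this is only an estimate.
        "approx_steps": results["train_steps_per_second"] * runtime,
    }


results = json.loads(TRAIN_RESULTS)
derived = derived_throughput(results)

for name, value in derived.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the recomputed samples-per-second figure agrees with the reported 2.347 after rounding, and the run lasted about 11.6 hours. The step count is only approximate because `train_steps_per_second` is stored with three decimal places.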