Zhangchen Xu committed
Commit 1051908 · verified · 1 Parent(s): 1b8b63a

Update README.md

Files changed (1):
  1. README.md (+2 −2)
README.md CHANGED
@@ -44,7 +44,7 @@ The overall performance is even better than the official Llama-3-8B-Instruct Mod
 - **Alpaca Eval 2 (vs Llama-3-8B-Instruct): 75.17 (LC), 78.20 (WR)**
 - **Arena Hard: 37.5**
 - **WildBench WB-Score: 42.7**
-- **Zero-Eval MMLU: 46.70**
+- **Zero-Eval GSM: 46.70**
 
 ## 🔥 Model Performance
 
@@ -63,7 +63,7 @@ We compare our Llama-3-8B-Magpie-Align with official and other **open-aligned LL
 +---------------------------------------------+------+------+------+----------+---------+-----------+-----------+------------+
 | NousResearch/Hermes-2-Pro-Llama-3-8B | 8.05 | 7.35 | 7.70 | 15.60 | 12.86 | 36.37 | 30.52 | 11.5 |
 +---------------------------------------------+------+------+------+----------+---------+-----------+-----------+------------+
-| allenai/llama-3-tulu-2-dpo-8b | 7.71 | 7.15 | 7.43 | 14.89 | 14.8 | 35.43 | 35.42 | 11.7 |
+| allenai/llama-3-tulu-2-dpo-8b | 7.71 | 7.15 | 7.43 | 14.89 | 14.80 | 35.43 | 35.42 | 11.7 |
 +---------------------------------------------+------+------+------+----------+---------+-----------+-----------+------------+
 | cognitivecomputations/dolphin-2.9-llama3-8b | 7.97 | 6.98 | 7.47 | 12.50 | 8.79 | 32.67 | 22.80 | 8.2 |
 +---------------------------------------------+------+------+------+----------+---------+-----------+-----------+------------+