Update README.md
README.md
@@ -74,7 +74,33 @@ Please generate a Advanced Dungeons & Dragons 2nd Edition character sheet for a
 
 ## Evals
 
-
+We evaluated checkpoint 1000 ([abacusai/Liberated-Qwen1.5-72B-c1000](https://huggingface.co/abacusai/Liberated-Qwen1.5-72B-c1000)) from this training run on MT Bench:
+
+```
+########## First turn ##########
+                                        score
+model                           turn
+Liberated-Qwen-1.5-72b-ckpt1000 1     8.45000
+Smaug-72B-v0.1                  1     8.21250
+
+########## Second turn ##########
+                                        score
+model                           turn
+Liberated-Qwen-1.5-72b-ckpt1000 2     7.65000
+Smaug-72B-v0.1                  2     7.20625
+
+########## Average ##########
+                                    score
+model
+Liberated-Qwen-1.5-72b-ckpt1000  8.050000
+Smaug-72B-v0.1                   7.709375
+```
+
+Smaug has a higher leaderboard average score, but these results suggest that the new dataset significantly helps with instruction following.
+
+The model also preserves good performance on MMLU, scoring 77.13.
 
 ## Future Plans
-This model will be released on the whole Qwen-1.5 series.
+This model will be released on the whole Qwen-1.5 series.
+
+Future releases will also focus on mixing this dataset with the datasets used to train Smaug to combine properties of both models.
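The "Average" rows in the MT Bench results above appear to be the plain mean of the first-turn and second-turn scores. The sketch below checks that arithmetic. It assumes both turns cover the same number of questions, so the overall average equals the mean of the two per-turn means. The names `turn_scores` and `average_scores` are illustrative only and are not part of any MT Bench tooling.

```python
# Per-turn MT Bench scores copied from the results in the diff above.
turn_scores = {
    "Liberated-Qwen-1.5-72b-ckpt1000": [8.45, 7.65],
    "Smaug-72B-v0.1": [8.2125, 7.20625],
}


def average_scores(scores):
    """Return the mean of the per-turn scores for each model.

    Assumption: every turn has the same number of questions, so the
    unweighted mean of the turn scores matches the overall average.
    """
    return {model: sum(turns) / len(turns) for model, turns in scores.items()}


averages = average_scores(turn_scores)
for model, avg in averages.items():
    print(f"{model}: {avg:.6f}")
```

Under that assumption, the computed values agree with the reported averages of 8.05 and 7.709375.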