model-index:
- name: outputs/lr-8e6
  results: []
---

I ran each eval twice to reduce variance. All runs use temperature 0.2, min_p 0.1, and frequency penalty 0.5.
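For reference, those sampling settings can be written out as a generation config; this is a minimal sketch using OpenAI-compatible/vLLM-style parameter names, which is an assumption about the serving stack:

```python
# Sampling settings used for all eval runs (values from the text above).
# Parameter names assume an OpenAI-compatible / vLLM-style API.
sampling_params = {
    "temperature": 0.2,        # low temperature for near-deterministic outputs
    "min_p": 0.1,              # min-p token filtering
    "frequency_penalty": 0.5,  # discourage token repetition
}
```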

| Model                        | AVG Score | ELYZA100 | JA MT-Bench | Rakuda | Tengu-Bench | JA Char % |
|------------------------------|-----------|----------|-------------|--------|-------------|-----------|
| shisa-v1-llama3-8b.lr-2e4    | 3.97      | 4.60     | 4.54        | 3.33   | 3.42        | 92.42%    |
| shisa-v1-llama3-8b.lr-5e5    | 5.73      | 6.28     | 6.45        | 5.37   | 4.81        | 90.93%    |
| shisa-v1-llama3-8b (2e5 avg) | 6.33      | 6.51     | 6.66        | 6.68   | 5.48        | 91.51%    |
| shisa-v1-llama3-8b.8e6       | 6.59      | 6.67     | 6.95        | 7.05   | 5.68        | 91.30%    |
| shisa-v1-llama3-8b.5e6       | 6.42      | 6.33     | 6.76        | 7.15   | 5.45        | 91.56%    |
| shisa-v1-llama3-8b.2e6       | 6.31      | 6.26     | 6.88        | 6.73   | 5.38        | 92.00%    |

* The 2e-4 and 5e-5 runs are clearly overtrained and score significantly worse.
* 2e-5 is on the edge: WeightWatcher flags the embedding layer as slightly overtrained at 2e-5, although the NEFTune version is not.
* 8e-6 performs best, and 5e-6 also scores slightly better than 2e-5.
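The AVG Score column is the plain mean of the four benchmark scores, rounded to two decimals; a minimal sketch of that aggregation:

```python
def avg_score(scores):
    """Mean of the per-benchmark scores (ELYZA100, JA MT-Bench, Rakuda,
    Tengu-Bench), rounded to 2 decimals like the AVG Score column."""
    return round(sum(scores) / len(scores), 2)

# 8e-6 row from the table above.
print(avg_score([6.67, 6.95, 7.05, 5.68]))  # -> 6.59
```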

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->