Commit b18689e · Verah · verified · 1 Parent(s): 6b7b97d

Update README.md

Files changed (1):
  1. README.md +32 -0
@@ -1,3 +1,35 @@
  ---
  license: apache-2.0
  ---
+ This is a linear model merge of:
+
+ 60%
+ https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2
+
+ 40%
+ https://huggingface.co/stabilityai/japanese-stablelm-instruct-gamma-7b
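As a rough illustration of what a 60/40 linear merge means, the sketch below averages two sets of parameters elementwise. It is a minimal sketch under stated assumptions, not the procedure used for this model: the README does not say which tool performed the merge, and the function name `linear_merge`, the toy parameter names, and the use of plain Python lists in place of tensors are all hypothetical. It also assumes both checkpoints expose the same parameter names and shapes.

```python
def linear_merge(state_a, state_b, weight_a=0.6, weight_b=0.4):
    """Elementwise weighted average of two parameter dicts.

    Hypothetical helper: parameters are flat lists of floats here,
    standing in for real model tensors.
    """
    if state_a.keys() != state_b.keys():
        raise ValueError("both models must share the same parameter names")
    merged = {}
    for name, values_a in state_a.items():
        values_b = state_b[name]
        if len(values_a) != len(values_b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            weight_a * a + weight_b * b for a, b in zip(values_a, values_b)
        ]
    return merged


# Toy parameters standing in for the two 7B checkpoints (illustrative only).
mistral_like = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
gamma_like = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

merged = linear_merge(mistral_like, gamma_like)
print(merged)
```

With weights of 0.6 and 0.4, each merged value sits closer to the first model's value than to the second's.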
+
+
+ ## Evaluation
+
+ Tested on identifying correct en-jp translations, using the first 10k rows of https://huggingface.co/datasets/Verah/tatoeba_dedupe_en-jp_2024-March-01
+
+ The desired behaviour is to accept no translation when the model is deliberately shown incorrect pairings from the dataset, and to reject no translation when it is shown only correctly paired examples.
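The two error counts described above could be tallied roughly as follows. This is a sketch under assumptions: the README does not say how the incorrect pairings were generated or how the model's answer was parsed, so `shift_pairs` (pairing each English sentence with the next row's translation), `count_errors`, and the `judge` callable that returns `True` for "accept" are all hypothetical.

```python
def shift_pairs(pairs, offset=1):
    """Build incorrect pairings by matching each English sentence with the
    translation `offset` rows later (an assumed mispairing scheme)."""
    n = len(pairs)
    return [(pairs[i][0], pairs[(i + offset) % n][1]) for i in range(n)]


def count_errors(judge, correct_pairs, incorrect_pairs):
    """Return (false_admissions, false_rejections) for a judge callable.

    judge(en, jp) is assumed to return True when the model accepts the pair.
    """
    false_admissions = sum(1 for en, jp in incorrect_pairs if judge(en, jp))
    false_rejections = sum(1 for en, jp in correct_pairs if not judge(en, jp))
    return false_admissions, false_rejections


# Tiny stand-in for the dataset rows (illustrative only).
pairs = [
    ("Hello.", "こんにちは。"),
    ("Thank you.", "ありがとう。"),
    ("Good night.", "おやすみなさい。"),
]
known = dict(pairs)


def lookup_judge(en, jp):
    """Toy judge that accepts only the known translation."""
    return known.get(en) == jp


def accept_all_judge(en, jp):
    """Toy judge that accepts everything, so every incorrect pair is admitted."""
    return True


print(count_errors(lookup_judge, pairs, shift_pairs(pairs)))
print(count_errors(accept_all_judge, pairs, shift_pairs(pairs)))
```

A judge that accepts everything scores zero false rejections but admits every incorrect pair, which is why both counts are needed to compare models.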
+
+
+ | Model              | False Admissions | False Rejections |
+ |--------------------|------------------|------------------|
+ | Mistral Instruct   | 41               | 600              |
+ | (This Model)       | 13               | 1839             |
+ | JP Stable LM Gamma | 9679             | 138              |
+ | Hermes2DPO         | 20               | 598              |
+
+
+ I made the test harder by concatenating 3 sentence pairs together. In the false admissions case, 1 of those 3 pairs was incorrectly paired.
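One way the harder test inputs could be built is sketched below. The details are assumptions, since the README does not give them: the helper name `build_examples`, the space separator, the choice to corrupt the first translation in each group, and the choice of the donor row are all hypothetical.

```python
def build_examples(pairs, group_size=3, corrupt=False, separator=" "):
    """Concatenate consecutive (en, jp) pairs into groups of `group_size`.

    With corrupt=True, the first translation in each group is replaced by a
    translation taken from a row outside that group (assumed scheme).
    """
    if corrupt and len(pairs) <= group_size:
        raise ValueError("need more rows than group_size to pick a donor")
    examples = []
    usable = len(pairs) - len(pairs) % group_size  # drop the incomplete tail
    for start in range(0, usable, group_size):
        chunk = pairs[start:start + group_size]
        en_parts = [en for en, _ in chunk]
        jp_parts = [jp for _, jp in chunk]
        if corrupt:
            # Donor is the row right after this group, wrapping to row 0.
            donor = pairs[(start + group_size) % len(pairs)]
            jp_parts[0] = donor[1]
        examples.append((separator.join(en_parts), separator.join(jp_parts)))
    return examples


# Tiny stand-in for the dataset rows (illustrative only).
rows = [
    ("Hello.", "こんにちは。"),
    ("Thank you.", "ありがとう。"),
    ("Good night.", "おやすみなさい。"),
    ("Good morning.", "おはよう。"),
    ("Goodbye.", "さようなら。"),
    ("Excuse me.", "すみません。"),
]

correct_examples = build_examples(rows)
corrupted_examples = build_examples(rows, corrupt=True)
print(correct_examples[0])
print(corrupted_examples[0])
```

Each corrupted example still contains two correctly paired sentences, so a model has to check all three segments before accepting.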
+
+ | Model        | False Admissions | False* Rejections |
+ |--------------|------------------|-------------------|
+ | (This Model) | 89               | 5508              |
+ | Hermes2DPO   | 537              | 1458              |
+
+ This model also rejected many "correct" translations in this test. However, 3 unrelated sentences placed back to back is not a very correct input either, so a rejection here is not clearly an error.