Commit dcf040b (verified) by ptrdvn
1 Parent(s): ac4252b

Update README.md

Files changed (1)
  1. README.md +8 -0
README.md CHANGED
@@ -25,6 +25,14 @@ Note that this model has a non-commercial license as we used the Command R and C
 
 We are currently working on developing a commercially usable model, so stay tuned for that!
 
+# Model list
+
+We have ORPO trained the following models using different proportions of the [lightblue/mitsu](https://huggingface.co/datasets/lightblue/mitsu) dataset:
+* Trained on the top/bottom responses of all prompts in the dataset: [lightblue/suzume-llama-3-8B-multilingual-orpo-borda-full](https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual-orpo-borda-full)
+* Trained on the top/bottom responses of the prompts of the 75\% most consistently ranked responses in the dataset: [lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top75](https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top75)
+* Trained on the top/bottom responses of the prompts of the 50\% most consistently ranked responses in the dataset: [lightblue/suzume-llama-3-8B-multilingual-orpo-borda-half](https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual-orpo-borda-half)
+* Trained on the top/bottom responses of the prompts of the 25\% most consistently ranked responses in the dataset: [lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top25](https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top25)
+
 # Model results
 
 We compare the MT-Bench scores across 6 languages for our 4 ORPO trained models, as well as some baselines:
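
The model list added in this commit describes a data-selection scheme: take the top and bottom responses per prompt, optionally restricted to the prompts whose responses were ranked most consistently. The following is a minimal sketch of how such a selection could work, not the authors' actual pipeline. It assumes that each prompt comes with several complete, tie-free rankings of the same responses, that rankings are aggregated with a Borda count (suggested only by the "borda" in the model names), and that consistency is measured with Kendall's W. The function names, the data layout, and the choice of Kendall's W are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: prompt -> list of rankings, each ranking lists
# response ids from best to worst. This is NOT the real mitsu schema.
Rankings = List[List[str]]


def borda_totals(rankings: Rankings) -> Dict[str, int]:
    """Sum Borda points per response: best of n gets n-1 points, worst gets 0."""
    totals: Dict[str, int] = {}
    for ranking in rankings:
        n = len(ranking)
        for position, response in enumerate(ranking):
            totals[response] = totals.get(response, 0) + (n - 1 - position)
    return totals


def kendalls_w(rankings: Rankings) -> float:
    """Kendall's coefficient of concordance for complete, tie-free rankings."""
    m = len(rankings)      # number of independent rankings
    n = len(rankings[0])   # number of responses being ranked
    rank_sums = {response: 0 for response in rankings[0]}
    for ranking in rankings:
        for position, response in enumerate(ranking):
            rank_sums[response] += position + 1
    mean = m * (n + 1) / 2
    s = sum((total - mean) ** 2 for total in rank_sums.values())
    return 12 * s / (m ** 2 * (n ** 3 - n))


def build_pairs(
    prompts: Dict[str, Rankings], keep_fraction: float
) -> List[Tuple[str, str, str]]:
    """Keep the most consistently ranked prompts and return
    (prompt, chosen, rejected) triples using the Borda top and bottom."""
    ordered = sorted(
        prompts.items(), key=lambda item: kendalls_w(item[1]), reverse=True
    )
    keep = max(1, int(len(ordered) * keep_fraction))
    pairs = []
    for prompt, rankings in ordered[:keep]:
        totals = borda_totals(rankings)
        chosen = max(totals, key=totals.get)
        rejected = min(totals, key=totals.get)
        pairs.append((prompt, chosen, rejected))
    return pairs


# Toy data: "p1" is ranked identically three times, "p2" is contested.
example = {
    "p1": [["a", "b", "c"], ["a", "b", "c"], ["a", "b", "c"]],
    "p2": [["a", "b", "c"], ["c", "b", "a"], ["b", "a", "c"]],
}
half = build_pairs(example, keep_fraction=0.5)
```

Under these assumptions, `keep_fraction=1.0` would correspond to the "full" variant and `0.75`, `0.5`, `0.25` to the "top75", "half", and "top25" variants. The resulting triples have the prompt/chosen/rejected shape that preference-optimization methods such as ORPO typically consume.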