Update README.md
README.md
CHANGED
@@ -1,3 +1,55 @@
|
|
1 |
-
---
|
2 |
-
|
3 |
-
1 |
+
---
|
2 |
+
language:
|
3 |
+
- sv
|
4 |
+
license: llama3.1
|
5 |
+
library_name: transformers
|
6 |
+
tags:
|
7 |
+
- unsloth
|
8 |
+
datasets:
|
9 |
+
- neph1/bellman-7b-finetune
|
10 |
+
- neph1/codefeedback-swedish
|
11 |
+
---
|
12 |
+
|
13 |
+
# Model Card for Bellman
|
14 |
+
|
15 |
+
This version of bellman is finetuned from llama-3.1-instruct-8b.
|
16 |
+
It's finetuned for prompt question answering, based on a dataset created from
|
17 |
+
Swedish wikipedia, with a lot of Sweden-centric questions.
|
18 |
+
New in this version are questions from a translated code-feedback dataset, as well as a number of stories. It's not great at generating stories,
|
19 |
+
but better than previously.
|
20 |
+
|
21 |
+
Please note: the Hugging Face Inference API will probably try to load the adapter (LoRA), which isn't going to work.
|
22 |
+
|
23 |
+

|
24 |
+
|
25 |
+
## Model Details
|
26 |
+
|
27 |
+
Training run on 240606:
|
28 |
+
|
29 |
+
Step Training Loss Validation Loss<br>
|
30 |
+
25 1.352200 1.034565<br>
|
31 |
+
50 1.033600 1.009348<br>
|
32 |
+
75 1.022400 0.996665<br>
|
33 |
+
100 1.002900 0.988050<br>
|
34 |
+
125 1.014600 0.981633<br>
|
35 |
+
150 1.006300 0.975584<br>
|
36 |
+
175 0.988800 0.970966<br>
|
37 |
+
200 0.985300 0.967037<br>
|
38 |
+
225 0.992400 0.964120<br>
|
39 |
+
250 0.950000 0.962472<br>
|
40 |
+
275 0.931000 0.960848<br>
|
41 |
+
300 0.932000 0.958946 <-- picked checkpoint <br>
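
The card does not say how the checkpoint was picked. The following is a minimal sketch, assuming the criterion was simply the lowest validation loss in the log above; `LOG` and `best_checkpoint` are illustrative names, not part of the training code.

```python
# Training log copied from the table above: (step, training loss, validation loss).
LOG = [
    (25, 1.352200, 1.034565),
    (50, 1.033600, 1.009348),
    (75, 1.022400, 0.996665),
    (100, 1.002900, 0.988050),
    (125, 1.014600, 0.981633),
    (150, 1.006300, 0.975584),
    (175, 0.988800, 0.970966),
    (200, 0.985300, 0.967037),
    (225, 0.992400, 0.964120),
    (250, 0.950000, 0.962472),
    (275, 0.931000, 0.960848),
    (300, 0.932000, 0.958946),
]


def best_checkpoint(log):
    """Return the (step, train_loss, val_loss) row with the lowest validation loss.

    Assumption: lowest validation loss is the selection rule; the card itself
    only marks step 300 as the picked checkpoint.
    """
    return min(log, key=lambda row: row[2])


step, train_loss, val_loss = best_checkpoint(LOG)
print(f"best step: {step}, validation loss: {val_loss:.6f}")
```

Under that assumption the result agrees with the marked row: validation loss falls at every logged step, so the last step (300) is also the minimum, even though training loss ticks up slightly between steps 275 and 300.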
|
42 |
+
|
43 |
+
### Model Description
|
44 |
+
|
45 |
+
|
46 |
+
- **Developed by:** Me
|
47 |
+
- **Funded by:** Me
|
48 |
+
- **Model type:** Instruct
|
49 |
+
- **Language(s) (NLP):** Swedish
|
50 |
+
- **License:** llama3.1
|
51 |
+
- **Finetuned from model:** Llama 3.1 Instruct 8B
|
52 |
+
|
53 |
+
## Model Card Contact
|
54 |
+
|
55 |