Suparious committed
Commit 1870c62 · verified · 1 Parent(s): 425b7fa

Update README.md

Files changed (1)
  1. README.md +35 -0
README.md CHANGED
@@ -1,12 +1,28 @@
  ---
+ license: other
+ base_model: meta-llama/Meta-Llama-3-8B
  library_name: transformers
  tags:
  - 4-bit
  - AWQ
  - text-generation
  - autotrain_compatible
+ - generated_from_trainer
+ - axolotl
  - endpoints_compatible
  pipeline_tag: text-generation
+ model-index:
+ - name: out
+   results: []
+ datasets:
+ - cognitivecomputations/Dolphin-2.9
+ - teknium/OpenHermes-2.5
+ - m-a-p/CodeFeedback-Filtered-Instruction
+ - cognitivecomputations/dolphin-coder
+ - cognitivecomputations/samantha-data
+ - microsoft/orca-math-word-problems-200k
+ - Locutusque/function-calling-chatml
+ - internlm/Agent-FLAN
  inference: false
  quantized_by: Suparious
  ---
@@ -15,7 +31,26 @@ quantized_by: Suparious
  - Model creator: [cognitivecomputations](https://huggingface.co/cognitivecomputations)
  - Original model: [dolphin-2.9.1-llama-3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9.1-llama-3-8b)

+ <img src="https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/ldkN1J0WIDQwU4vutGYiD.png" width="600" />

+ ## Model Summary
+
+ Dolphin 2.9.1 Llama 3 8b 🐬
+
+ Curated and trained by Eric Hartford, Lucas Atkins, Fernando Fernandes, and Cognitive Computations
+
+ Discord: https://discord.gg/8fbBeC7ZGx
+
+ We have retrained our Llama-3-8b fine-tune to address behavioral issues in the initial 2.9 dataset. Specifically, SystemChat was making the model *too* reliant on the system prompt, and it had an occasional quirk of over-referencing the system prompt in its replies. We also found that generation length was at times insufficient for the task, and identified UltraChat as the culprit. Accounting for these concerns, we removed SystemChat and UltraChat from the dataset; it is otherwise identical to dolphin-2.9.
+
+ Our appreciation to the sponsors of Dolphin 2.9.1:
+ - [Crusoe Cloud](https://crusoe.ai/) - provided an excellent on-demand 8x L40S node
+
+ This model is based on Llama-3-8b and is governed by the [META LLAMA 3 COMMUNITY LICENSE AGREEMENT](LICENSE).
+
+ The base model has an 8k context window, and the full-weight fine-tuning used a 4k sequence length.
+
+ Training took 1.5 days on an 8x L40S node provided by Crusoe Cloud.

  ## How to use
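The diff is truncated at the unchanged `## How to use` heading, so the usage instructions themselves are not part of this commit. For orientation only, loading a 4-bit AWQ quant such as this one through `transformers` (with the `autoawq` package installed) generally follows the sketch below; the repository id, prompt, and generation settings are placeholders I am assuming, not taken from the model card.

```python
# Minimal sketch of loading an AWQ-quantized checkpoint with transformers.
# Requires the `autoawq` and `accelerate` packages; the repo id below is a
# placeholder (assumption) -- substitute the actual AWQ repository name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "solidrust/dolphin-2.9.1-llama-3-8b-AWQ"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Dolphin releases typically ship a ChatML chat template, so
# apply_chat_template builds the prompt; if the tokenizer lacks a template,
# the prompt would need to be formatted manually.
messages = [{"role": "user", "content": "Write a haiku about dolphins."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```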