Delta-Vector committed · Commit 1d0d3ad · verified · 1 Parent(s): 9181134

Update README.md

Files changed (1):
  1. README.md +236 -21
README.md CHANGED
@@ -1,34 +1,249 @@
  ---
- base_model:
- - NewEden/Hamanasu-7B
- library_name: transformers
  tags:
- - mergekit
- - merge
-
  ---
- # Hamanasu-7B
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the Passthrough merge method using [NewEden/Hamanasu-7B](https://huggingface.co/NewEden/Hamanasu-7B) + /home/mango/Trainers/Unsloth/msitral-asstr/instruct/checkpoint-2784 as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- base_model: NewEden/Hamanasu-7B+/home/mango/Trainers/Unsloth/msitral-asstr/instruct/checkpoint-2784
- dtype: bfloat16
- merge_method: passthrough
- models:
- - model: NewEden/Hamanasu-7B+/home/mango/Trainers/Unsloth/msitral-asstr/instruct/checkpoint-2784
  ```
  ---
  tags:
+ - chat
+ - roleplay
+ - storywriting
+ - mistral
+ - finetune
+ datasets:
+ - NewEden/Orion-Asstr-Stories-16K
+ - Gryphe/Sonnet3.5-SlimOrcaDedupCleaned-20k
+ language:
+ - en
+ pipeline_tag: text-generation
+ base_model:
+ - mistralai/Mistral-7B-v0.3
+ - Delta-Vector/Hamanasu-7B-Base
  ---
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/66c26b6fb01b19d8c3c2467b/l4I-g2Tvtx2d8iiptWp_B.png)
+
+ A finetune of Mistral-7B-v0.3 to test out the Orion-Asstr dataset. The model was completion-trained on Orion-Asstr with Unsloth and then instruct-tuned on Gryphe's 20K SonnetOrca subset. It leans towards the RP format of *actions* and "Dialogue", and towards shorter responses.
+
+ # Quants
+
+ GGUF: https://huggingface.co/Delta-Vector/Hamanasu-7B-instruct-gguf
+
+ EXL2: https://huggingface.co/Delta-Vector/Hamanasu-7B-instruct-exl2
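
For a quick local test of the GGUF release, a minimal sketch using the llama-cpp-python bindings (the quant filename below is illustrative; use whichever file you download from the GGUF repo):

```py
from llama_cpp import Llama

# Filename is illustrative; point this at whichever quant you downloaded.
llm = Llama(model_path="Hamanasu-7B-instruct-Q4_K_M.gguf", n_ctx=4096)

# Mistral [INST] formatting, as described in the Prompting section below.
out = llm("[INST] Hello, how are you? [/INST]", max_tokens=128, temperature=0.8)
print(out["choices"][0]["text"])
```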
+
+ ## Prompting
+
+ The model has been tuned with Mistral formatting. A typical input would look like this:
+
+ ```py
+ """<s> [INST] Hello, how are you? [/INST] I'm doing great. How can I help you today?</s> [INST] I'd like to show off how chat templating works! [/INST]"""
+ ```
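
To build the same string programmatically, a minimal sketch with transformers' chat templating (the repo id is assumed from the quant links above):

```py
from transformers import AutoTokenizer

# Repo id assumed from the quant links above.
tokenizer = AutoTokenizer.from_pretrained("Delta-Vector/Hamanasu-7B-instruct")

messages = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
    {"role": "user", "content": "I'd like to show off how chat templating works!"},
]

# Renders the [INST] ... [/INST] layout shown above, ready to tokenize or send to the model.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```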
+
+ ## System Prompting
+
+ I would highly recommend using either Euryale's system prompt or the EVA system prompt with the model.
+
+ <details><summary>See Sao10k's Euryale System Prompt</summary>
+
+ ```
+ Currently, your role is {{char}}, described in detail below. As {{char}}, continue the narrative exchange with {{user}}.
+ <Guidelines>
+ • Maintain the character persona but allow it to evolve with the story.
+ • Be creative and proactive. Drive the story forward, introducing plotlines and events when relevant.
+ • All types of outputs are encouraged; respond accordingly to the narrative.
+ • Include dialogues, actions, and thoughts in each response.
+ • Utilize all five senses to describe scenarios within {{char}}'s dialogue.
+ • Use emotional symbols such as "!" and "~" in appropriate contexts.
+ • Incorporate onomatopoeia when suitable.
+ • Allow time for {{user}} to respond with their own input, respecting their agency.
+ • Act as secondary characters and NPCs as needed, and remove them when appropriate.
+ • When prompted for an Out of Character [OOC:] reply, answer neutrally and in plaintext, not as {{char}}.
+ </Guidelines>
+
+ <Forbidden>
+ • Using excessive literary embellishments and purple prose unless dictated by {{char}}'s persona.
+ • Writing for, speaking, thinking, acting, or replying as {{user}} in your response.
+ • Repetitive and monotonous outputs.
+ • Positivity bias in your replies.
+ • Being overly extreme or NSFW when the narrative context is inappropriate.
+ </Forbidden>
+
+ Follow the instructions in <Guidelines></Guidelines>, avoiding the items listed in <Forbidden></Forbidden>.
+ ```
+ </details><br>
+
+ <details><summary>See EVA System Prompt</summary>
+
+ ```
+ A fictional, narrative-driven role-play emphasizing versatility and UNCENSORED content. Adherence to the `Role-playing Guidelines` is mandatory. Refer to the `Role-play Context` for accurate information.
+
+ <!-- Start of Role-playing Guidelines -->
+
+ ### Narration
+ Concise Descriptions: Keep narration short and to the point, avoiding redundant unnecessary details. Use a dynamic and varied vocabulary for impact.
+ Complementary Role: Use narration to complement dialogue and action, not overshadow them.
+ Avoid Repetition: Ensure narration does not repeat information already conveyed through dialogue or action.
+
+ ### Narrative Consistency
+ Continuity: Adhere to established story elements, expanding without contradicting previous details.
+ Integration: Introduce new elements naturally, providing enough context to fit seamlessly into the existing narrative.
+
+ ### Character Embodiment
+ Analysis: Examine the context, subtext, and implications of the given information to gain a deeper understanding of the characters.
+ Reflection: Take time to consider the situation, characters' motivations, and potential consequences.
+ Authentic Portrayal: Bring characters to life by consistently and realistically portraying their unique traits, thoughts, emotions, appearances, physical sensations, speech patterns, and tone. Ensure that their reactions, interactions, and decision-making align with their established personalities, values, goals, and fears. Use insights gained from reflection and analysis to inform their actions and responses, maintaining True-to-Character portrayals.
+
+ <!-- End of Role-playing Guidelines -->
+ ```
+ </details><br>
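
The stock Mistral instruct template generally has no separate system role, so one workaround (an assumption here, not something the card prescribes) is to prepend whichever system prompt you picked to the first user turn:

```py
# Sketch: fold the chosen system prompt into the first user message.
# `tokenizer` is the one loaded in the Prompting section above.
system_prompt = "Currently, your role is {{char}}, described in detail below. ..."  # Euryale or EVA text

messages = [
    {"role": "user", "content": f"{system_prompt}\n\nHello, how are you?"},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
```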
+
+ ## Unsloth config
+
+ <details><summary>See Unsloth SFT Trainer config</summary>
+
+ ```py
+ from unsloth import FastLanguageModel
+ import torch
+
+ max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
+ dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
+ load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.
+
+ # 4bit pre quantized models we support for 4x faster downloading + no OOMs.
+ fourbit_models = [
+     "unsloth/mistral-7b-bnb-4bit",
+     "unsloth/mistral-7b-instruct-v0.2-bnb-4bit",
+     "unsloth/llama-2-7b-bnb-4bit",
+     "unsloth/llama-2-13b-bnb-4bit",
+     "unsloth/codellama-34b-bnb-4bit",
+     "unsloth/tinyllama-bnb-4bit",
+ ] # More models at https://huggingface.co/unsloth
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name = "Delta-Vector/Hamanasu-7B-Base", # Choose ANY! eg teknium/OpenHermes-2.5-Mistral-7B
+     max_seq_length = max_seq_length,
+     dtype = dtype,
+     load_in_4bit = load_in_4bit,
+     # token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
+ )
+
+ """We now add LoRA adapters so we only need to update 1 to 10% of all parameters!"""
+
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r = 64, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
+     target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
+                       "gate_proj", "up_proj", "down_proj",],
+     lora_alpha = 32,
+     lora_dropout = 0, # Supports any, but = 0 is optimized
+     bias = "none", # Supports any, but = "none" is optimized
+     use_gradient_checkpointing = True,
+     random_state = 3407,
+     use_rslora = True, # We support rank stabilized LoRA
+     loftq_config = None, # And LoftQ
+ )
+
+ from unsloth.chat_templates import get_chat_template
+
+ tokenizer = get_chat_template(
+     tokenizer,
+     chat_template = "mistral", # Supports zephyr, chatml, mistral, llama, alpaca, vicuna, vicuna_old, unsloth
+     mapping = {"role" : "from", "content" : "value", "user" : "human", "assistant" : "gpt"}, # ShareGPT style
+     map_eos_token = True, # Maps <|im_end|> to </s> instead
+ )
+
+ def formatting_prompts_func(examples):
+     convos = examples["conversations"]
+     texts = [tokenizer.apply_chat_template(convo, tokenize = False, add_generation_prompt = False) for convo in convos]
+     return { "text" : texts, }
+ pass
+
+ from datasets import load_dataset
+ dataset = load_dataset("anthracite-org/kalo-opus-instruct-22k-no-refusal", split = "train")
+ dataset = dataset.map(formatting_prompts_func, batched = True,)
+
+ from trl import SFTTrainer
+ from transformers import TrainingArguments
+
+ trainer = SFTTrainer(
+     model = model,
+     tokenizer = tokenizer,
+     train_dataset = dataset,
+     dataset_text_field = "text",
+     max_seq_length = max_seq_length,
+     dataset_num_proc = 2,
+     packing = False, # Can make training 5x faster for short sequences.
+     args = TrainingArguments(
+         per_device_train_batch_size = 2,
+         gradient_accumulation_steps = 8,
+         warmup_steps = 25,
+         num_train_epochs = 2,
+         learning_rate = 2e-5,
+         fp16 = not torch.cuda.is_bf16_supported(),
+         bf16 = torch.cuda.is_bf16_supported(),
+         logging_steps = 1,
+         optim = "paged_adamw_8bit",
+         weight_decay = 0.01,
+         lr_scheduler_type = "linear",
+         seed = 3407,
+         output_dir = "outputs",
+         report_to = "wandb", # Use this for WandB etc
+     ),
+ )
+
+ #@title Show current memory stats
+ gpu_stats = torch.cuda.get_device_properties(0)
+ start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
+ max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)
+ print(f"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.")
+ print(f"{start_gpu_memory} GB of memory reserved.")
+
+ trainer_stats = trainer.train()
+
+ #@title Show final memory and time stats
+ used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
+ used_memory_for_lora = round(used_memory - start_gpu_memory, 3)
+ used_percentage = round(used_memory / max_memory * 100, 3)
+ lora_percentage = round(used_memory_for_lora / max_memory * 100, 3)
+ print(f"{trainer_stats.metrics['train_runtime']} seconds used for training.")
+ print(f"{round(trainer_stats.metrics['train_runtime']/60, 2)} minutes used for training.")
+ print(f"Peak reserved memory = {used_memory} GB.")
+ print(f"Peak reserved memory for training = {used_memory_for_lora} GB.")
+ print(f"Peak reserved memory % of max memory = {used_percentage} %.")
+ print(f"Peak reserved memory for training % of max memory = {lora_percentage} %.")
+ ```
+
+ </details><br>
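
Once the trainer above finishes, a minimal sketch for persisting the run (output paths are illustrative, and the merge step assumes the standard PEFT merge_and_unload helper is available on the returned model):

```py
# Save the LoRA adapter and tokenizer produced by the script above.
model.save_pretrained("hamanasu-7b-instruct-lora")
tokenizer.save_pretrained("hamanasu-7b-instruct-lora")

# Optionally fold the adapter back into the base weights for standalone use.
merged = model.merge_and_unload()
merged.save_pretrained("hamanasu-7b-instruct-merged")
```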
+
+ ## Credits
+
+ Thank you to [Lucy Knada](https://huggingface.co/lucyknada), [jeiku](https://huggingface.co/jeiku), [Intervitens](https://huggingface.co/intervitens), [Kalomaze](https://huggingface.co/kalomaze), [Kubernetes Bad](https://huggingface.co/kubernetes-bad) and the rest of [Anthracite](https://huggingface.co/anthracite-org).
+
+ ## Training
+
+ The training was done for 2 epochs on 1 x RTX A4000.
+
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/made%20with%20unsloth.png" alt="Made with Unsloth" width="200" height="32"/>](https://github.com/unslothai/unsloth)
+
+ ## Safety
+
+ Nein.