error577 committed on
Commit fc1796f · verified · 1 Parent(s): 49ce8e0

End of training

Files changed (3):
  1. README.md +12 -10
  2. adapter_model.bin +1 -1
  3. adapter_model.safetensors +1 -1
README.md CHANGED
@@ -50,19 +50,19 @@ flash_attention: true
 fp16: false
 fsdp: null
 fsdp_config: null
-gradient_accumulation_steps: 1
+gradient_accumulation_steps: 4
 gradient_checkpointing: true
 group_by_length: false
 hub_model_id: error577/ed864f34-7a77-4fe8-98e8-0903340e71dd
 hub_repo: null
 hub_strategy: checkpoint
 hub_token: null
-learning_rate: 0.0002
+learning_rate: 0.00000002
 load_in_4bit: false
 load_in_8bit: true
 local_rank: null
 logging_steps: 1
-lora_alpha: 64
+lora_alpha: 16
 lora_dropout: 0.0
 lora_fan_in_fan_out: null
 lora_model_dir: null
@@ -107,7 +107,7 @@ xformers_attention: null
 
 This model is a fine-tuned version of [unsloth/tinyllama](https://huggingface.co/unsloth/tinyllama) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.9738
+- Loss: 1.9212
 
 ## Model description
 
@@ -126,10 +126,12 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.0002
+- learning_rate: 2e-08
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 32
 - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 30
@@ -139,11 +141,11 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.9255 | 0.0002 | 1    | 1.9194 |
-| 1.9687 | 0.0306 | 200  | 1.9738 |
-| 2.0093 | 0.0611 | 400  | 1.9738 |
-| 2.0179 | 0.0917 | 600  | 1.9738 |
-| 1.8811 | 0.1223 | 800  | 1.9738 |
+| 1.9788 | 0.0006 | 1    | 1.9194 |
+| 1.9983 | 0.1223 | 200  | 1.9197 |
+| 1.8519 | 0.2446 | 400  | 1.9200 |
+| 1.9556 | 0.3669 | 600  | 1.9202 |
+| 1.8386 | 0.4891 | 800  | 1.9212 |
 
 
 ### Framework versions
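The updated README reports a total_train_batch_size of 32, which follows from the per-device batch size and the new gradient_accumulation_steps value in this commit. A minimal sketch of that arithmetic (the single-device count is an assumption; the diff does not state it explicitly):

```python
# Effective (total) train batch size, as reported in the updated README:
# per-device batch size x gradient accumulation steps x device count.
train_batch_size = 8              # per-device micro-batch size (unchanged)
gradient_accumulation_steps = 4   # new value in this commit (was 1)
num_devices = 1                   # assumed: single GPU

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)     # 32, matching the README's total_train_batch_size
```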
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3711ce1dcd486f12e01f0d432e856f7c41657e16284034c53638290e571515de
+oid sha256:6a04d1c08472b6d1b8e07ffe378a084feb20442c77f4900286ee4db9b4bd94f1
 size 101036698
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fe90e90343251a89f8ed6d19a7f0b201f6a1c075b7d1cae726a534b0480e2c74
+oid sha256:ef8b36a6665e6e85d4b7cefcfca344ad529964178a10485e2ff9e300343a84ef
 size 100966336