2023-10-17 11:35:45,552 ----------------------------------------------------------------------------------------------------
2023-10-17 11:35:45,554 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 11:35:45,554 ----------------------------------------------------------------------------------------------------
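The architecture printed above is a Flair SequenceTagger with fine-tuned transformer word embeddings, first-subtoken pooling, last layer only, and no CRF (see the flags encoded in the base path further down). A minimal sketch of the embedding setup in Python, assuming the backbone is the hmteams/teams-base-historic-multilingual-discriminator checkpoint implied by that path:

# Sketch of the embedding configuration implied by the model dump and the
# base-path flags (poolingfirst, layers-1, wsFalse). The checkpoint name is an assumption.
from flair.embeddings import TransformerWordEmbeddings

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumed backbone
    layers="-1",               # last transformer layer only
    subtoken_pooling="first",  # represent each word by its first subtoken
    fine_tune=True,            # backbone weights are updated during training
    use_context=False,         # wsFalse: no document-level context window
)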
2023-10-17 11:35:45,555 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 11:35:45,555 ----------------------------------------------------------------------------------------------------
2023-10-17 11:35:45,555 Train: 6183 sentences
2023-10-17 11:35:45,555 (train_with_dev=False, train_with_test=False)
2023-10-17 11:35:45,555 ----------------------------------------------------------------------------------------------------
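A hedged sketch of loading this corpus and assembling the tagger; the NER_HIPE_2022 arguments are assumptions inferred from the dataset path above, and the head mirrors the printed architecture (Linear 768 -> 13, CrossEntropyLoss, no RNN, no CRF):

from flair.datasets import NER_HIPE_2022
from flair.models import SequenceTagger

# topres19th / English with document separators, as in the dataset path above (assumed arguments).
corpus = NER_HIPE_2022(dataset_name="topres19th", language="en", add_document_separator=True)
label_dict = corpus.make_label_dictionary(label_type="ner")  # LOC, BUILDING, STREET spans

tagger = SequenceTagger(
    hidden_size=256,            # unused when use_rnn=False
    embeddings=embeddings,      # TransformerWordEmbeddings from the sketch above
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)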
2023-10-17 11:35:45,555 Training Params:
2023-10-17 11:35:45,555 - learning_rate: "5e-05"
2023-10-17 11:35:45,555 - mini_batch_size: "4"
2023-10-17 11:35:45,555 - max_epochs: "10"
2023-10-17 11:35:45,555 - shuffle: "True"
2023-10-17 11:35:45,555 ----------------------------------------------------------------------------------------------------
2023-10-17 11:35:45,555 Plugins:
2023-10-17 11:35:45,555 - TensorboardLogger
2023-10-17 11:35:45,556 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 11:35:45,556 ----------------------------------------------------------------------------------------------------
2023-10-17 11:35:45,556 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 11:35:45,556 - metric: "('micro avg', 'f1-score')"
2023-10-17 11:35:45,556 ----------------------------------------------------------------------------------------------------
2023-10-17 11:35:45,556 Computation:
2023-10-17 11:35:45,556 - compute on device: cuda:0
2023-10-17 11:35:45,556 - embedding storage: none
2023-10-17 11:35:45,556 ----------------------------------------------------------------------------------------------------
2023-10-17 11:35:45,556 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 11:35:45,556 ----------------------------------------------------------------------------------------------------
2023-10-17 11:35:45,556 ----------------------------------------------------------------------------------------------------
2023-10-17 11:35:45,556 Logging anything other than scalars to TensorBoard is currently not supported.
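Putting the logged hyperparameters together, a fine-tuning call along the following lines would approximate this run; warmup_fraction=0.1 matches the logged LinearScheduler plugin and is the Flair fine_tune default (a sketch, not the exact command used for this log):

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)  # tagger and corpus from the sketches above

# Mirrors the logged training params: lr 5e-05, mini-batch size 4, 10 epochs,
# linear schedule with 10% warmup. The base path is taken from the log above.
trainer.fine_tune(
    "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=5e-5,
    mini_batch_size=4,
    max_epochs=10,
    warmup_fraction=0.1,
)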
2023-10-17 11:35:57,546 epoch 1 - iter 154/1546 - loss 1.83885851 - time (sec): 11.99 - samples/sec: 1026.62 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:36:09,631 epoch 1 - iter 308/1546 - loss 1.03146160 - time (sec): 24.07 - samples/sec: 1023.52 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:36:21,432 epoch 1 - iter 462/1546 - loss 0.72698415 - time (sec): 35.87 - samples/sec: 1043.63 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:36:33,820 epoch 1 - iter 616/1546 - loss 0.56735648 - time (sec): 48.26 - samples/sec: 1038.81 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:36:46,043 epoch 1 - iter 770/1546 - loss 0.47358265 - time (sec): 60.49 - samples/sec: 1033.96 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:36:58,100 epoch 1 - iter 924/1546 - loss 0.42155746 - time (sec): 72.54 - samples/sec: 1021.78 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:37:09,969 epoch 1 - iter 1078/1546 - loss 0.38135834 - time (sec): 84.41 - samples/sec: 1013.51 - lr: 0.000035 - momentum: 0.000000
2023-10-17 11:37:21,956 epoch 1 - iter 1232/1546 - loss 0.34787118 - time (sec): 96.40 - samples/sec: 1017.93 - lr: 0.000040 - momentum: 0.000000
2023-10-17 11:37:34,804 epoch 1 - iter 1386/1546 - loss 0.32052669 - time (sec): 109.25 - samples/sec: 1017.93 - lr: 0.000045 - momentum: 0.000000
2023-10-17 11:37:48,360 epoch 1 - iter 1540/1546 - loss 0.30029350 - time (sec): 122.80 - samples/sec: 1007.40 - lr: 0.000050 - momentum: 0.000000
2023-10-17 11:37:48,893 ----------------------------------------------------------------------------------------------------
2023-10-17 11:37:48,893 EPOCH 1 done: loss 0.2991 - lr: 0.000050
2023-10-17 11:37:51,313 DEV : loss 0.07589336484670639 - f1-score (micro avg) 0.7239
2023-10-17 11:37:51,349 saving best model
2023-10-17 11:37:51,942 ----------------------------------------------------------------------------------------------------
2023-10-17 11:38:04,647 epoch 2 - iter 154/1546 - loss 0.09620320 - time (sec): 12.70 - samples/sec: 933.49 - lr: 0.000049 - momentum: 0.000000
2023-10-17 11:38:16,990 epoch 2 - iter 308/1546 - loss 0.09650626 - time (sec): 25.04 - samples/sec: 958.28 - lr: 0.000049 - momentum: 0.000000
2023-10-17 11:38:28,988 epoch 2 - iter 462/1546 - loss 0.09623961 - time (sec): 37.04 - samples/sec: 983.56 - lr: 0.000048 - momentum: 0.000000
2023-10-17 11:38:43,408 epoch 2 - iter 616/1546 - loss 0.09668446 - time (sec): 51.46 - samples/sec: 961.53 - lr: 0.000048 - momentum: 0.000000
2023-10-17 11:38:55,786 epoch 2 - iter 770/1546 - loss 0.09355811 - time (sec): 63.84 - samples/sec: 963.34 - lr: 0.000047 - momentum: 0.000000
2023-10-17 11:39:08,327 epoch 2 - iter 924/1546 - loss 0.09142346 - time (sec): 76.38 - samples/sec: 975.58 - lr: 0.000047 - momentum: 0.000000
2023-10-17 11:39:21,196 epoch 2 - iter 1078/1546 - loss 0.09177734 - time (sec): 89.25 - samples/sec: 969.55 - lr: 0.000046 - momentum: 0.000000
2023-10-17 11:39:33,455 epoch 2 - iter 1232/1546 - loss 0.08997092 - time (sec): 101.51 - samples/sec: 966.64 - lr: 0.000046 - momentum: 0.000000
2023-10-17 11:39:45,555 epoch 2 - iter 1386/1546 - loss 0.08993457 - time (sec): 113.61 - samples/sec: 986.13 - lr: 0.000045 - momentum: 0.000000
2023-10-17 11:39:57,860 epoch 2 - iter 1540/1546 - loss 0.08963720 - time (sec): 125.92 - samples/sec: 983.69 - lr: 0.000044 - momentum: 0.000000
2023-10-17 11:39:58,318 ----------------------------------------------------------------------------------------------------
2023-10-17 11:39:58,318 EPOCH 2 done: loss 0.0903 - lr: 0.000044
2023-10-17 11:40:01,334 DEV : loss 0.0674557313323021 - f1-score (micro avg) 0.7649
2023-10-17 11:40:01,364 saving best model
2023-10-17 11:40:02,818 ----------------------------------------------------------------------------------------------------
2023-10-17 11:40:14,771 epoch 3 - iter 154/1546 - loss 0.05524478 - time (sec): 11.94 - samples/sec: 994.46 - lr: 0.000044 - momentum: 0.000000
2023-10-17 11:40:26,806 epoch 3 - iter 308/1546 - loss 0.06017580 - time (sec): 23.98 - samples/sec: 961.64 - lr: 0.000043 - momentum: 0.000000
2023-10-17 11:40:39,504 epoch 3 - iter 462/1546 - loss 0.05964897 - time (sec): 36.67 - samples/sec: 968.74 - lr: 0.000043 - momentum: 0.000000
2023-10-17 11:40:51,989 epoch 3 - iter 616/1546 - loss 0.06336051 - time (sec): 49.16 - samples/sec: 985.98 - lr: 0.000042 - momentum: 0.000000
2023-10-17 11:41:04,948 epoch 3 - iter 770/1546 - loss 0.06367074 - time (sec): 62.12 - samples/sec: 986.27 - lr: 0.000042 - momentum: 0.000000
2023-10-17 11:41:17,312 epoch 3 - iter 924/1546 - loss 0.06217703 - time (sec): 74.48 - samples/sec: 989.91 - lr: 0.000041 - momentum: 0.000000
2023-10-17 11:41:29,545 epoch 3 - iter 1078/1546 - loss 0.06292626 - time (sec): 86.72 - samples/sec: 993.43 - lr: 0.000041 - momentum: 0.000000
2023-10-17 11:41:42,319 epoch 3 - iter 1232/1546 - loss 0.06342969 - time (sec): 99.49 - samples/sec: 990.50 - lr: 0.000040 - momentum: 0.000000
2023-10-17 11:41:55,004 epoch 3 - iter 1386/1546 - loss 0.06375018 - time (sec): 112.17 - samples/sec: 991.19 - lr: 0.000039 - momentum: 0.000000
2023-10-17 11:42:07,262 epoch 3 - iter 1540/1546 - loss 0.06441018 - time (sec): 124.43 - samples/sec: 995.68 - lr: 0.000039 - momentum: 0.000000
2023-10-17 11:42:07,720 ----------------------------------------------------------------------------------------------------
2023-10-17 11:42:07,721 EPOCH 3 done: loss 0.0644 - lr: 0.000039
2023-10-17 11:42:11,059 DEV : loss 0.08243168890476227 - f1-score (micro avg) 0.7925
2023-10-17 11:42:11,089 saving best model
2023-10-17 11:42:12,545 ----------------------------------------------------------------------------------------------------
2023-10-17 11:42:24,575 epoch 4 - iter 154/1546 - loss 0.04181116 - time (sec): 12.03 - samples/sec: 1079.13 - lr: 0.000038 - momentum: 0.000000
2023-10-17 11:42:36,662 epoch 4 - iter 308/1546 - loss 0.04378985 - time (sec): 24.11 - samples/sec: 1093.29 - lr: 0.000038 - momentum: 0.000000
2023-10-17 11:42:48,869 epoch 4 - iter 462/1546 - loss 0.04961536 - time (sec): 36.32 - samples/sec: 1052.62 - lr: 0.000037 - momentum: 0.000000
2023-10-17 11:43:01,116 epoch 4 - iter 616/1546 - loss 0.04737236 - time (sec): 48.57 - samples/sec: 1038.49 - lr: 0.000037 - momentum: 0.000000
2023-10-17 11:43:13,185 epoch 4 - iter 770/1546 - loss 0.04886557 - time (sec): 60.64 - samples/sec: 1033.47 - lr: 0.000036 - momentum: 0.000000
2023-10-17 11:43:25,001 epoch 4 - iter 924/1546 - loss 0.04724064 - time (sec): 72.45 - samples/sec: 1034.00 - lr: 0.000036 - momentum: 0.000000
2023-10-17 11:43:37,214 epoch 4 - iter 1078/1546 - loss 0.04578932 - time (sec): 84.66 - samples/sec: 1023.58 - lr: 0.000035 - momentum: 0.000000
2023-10-17 11:43:49,462 epoch 4 - iter 1232/1546 - loss 0.04446100 - time (sec): 96.91 - samples/sec: 1014.71 - lr: 0.000034 - momentum: 0.000000
2023-10-17 11:44:01,995 epoch 4 - iter 1386/1546 - loss 0.04515610 - time (sec): 109.45 - samples/sec: 1017.77 - lr: 0.000034 - momentum: 0.000000
2023-10-17 11:44:14,511 epoch 4 - iter 1540/1546 - loss 0.04445910 - time (sec): 121.96 - samples/sec: 1014.54 - lr: 0.000033 - momentum: 0.000000
2023-10-17 11:44:14,992 ----------------------------------------------------------------------------------------------------
2023-10-17 11:44:14,992 EPOCH 4 done: loss 0.0444 - lr: 0.000033
2023-10-17 11:44:17,824 DEV : loss 0.10114207118749619 - f1-score (micro avg) 0.7876
2023-10-17 11:44:17,853 ----------------------------------------------------------------------------------------------------
2023-10-17 11:44:30,230 epoch 5 - iter 154/1546 - loss 0.03054617 - time (sec): 12.38 - samples/sec: 989.97 - lr: 0.000033 - momentum: 0.000000
2023-10-17 11:44:42,982 epoch 5 - iter 308/1546 - loss 0.02559762 - time (sec): 25.13 - samples/sec: 1019.18 - lr: 0.000032 - momentum: 0.000000
2023-10-17 11:44:54,994 epoch 5 - iter 462/1546 - loss 0.02893282 - time (sec): 37.14 - samples/sec: 997.64 - lr: 0.000032 - momentum: 0.000000
2023-10-17 11:45:07,478 epoch 5 - iter 616/1546 - loss 0.03055604 - time (sec): 49.62 - samples/sec: 984.47 - lr: 0.000031 - momentum: 0.000000
2023-10-17 11:45:20,590 epoch 5 - iter 770/1546 - loss 0.02918857 - time (sec): 62.74 - samples/sec: 972.04 - lr: 0.000031 - momentum: 0.000000
2023-10-17 11:45:33,162 epoch 5 - iter 924/1546 - loss 0.03004856 - time (sec): 75.31 - samples/sec: 973.68 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:45:45,194 epoch 5 - iter 1078/1546 - loss 0.02969641 - time (sec): 87.34 - samples/sec: 986.54 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:45:57,479 epoch 5 - iter 1232/1546 - loss 0.03039943 - time (sec): 99.62 - samples/sec: 992.67 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:46:09,808 epoch 5 - iter 1386/1546 - loss 0.03111409 - time (sec): 111.95 - samples/sec: 993.99 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:46:22,446 epoch 5 - iter 1540/1546 - loss 0.03134691 - time (sec): 124.59 - samples/sec: 991.39 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:46:22,971 ----------------------------------------------------------------------------------------------------
2023-10-17 11:46:22,971 EPOCH 5 done: loss 0.0314 - lr: 0.000028
2023-10-17 11:46:25,840 DEV : loss 0.10744524002075195 - f1-score (micro avg) 0.7699
2023-10-17 11:46:25,868 ----------------------------------------------------------------------------------------------------
2023-10-17 11:46:37,933 epoch 6 - iter 154/1546 - loss 0.02268715 - time (sec): 12.06 - samples/sec: 982.22 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:46:50,599 epoch 6 - iter 308/1546 - loss 0.02384393 - time (sec): 24.73 - samples/sec: 935.04 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:47:03,236 epoch 6 - iter 462/1546 - loss 0.02247440 - time (sec): 37.37 - samples/sec: 958.97 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:47:15,777 epoch 6 - iter 616/1546 - loss 0.02095738 - time (sec): 49.91 - samples/sec: 984.23 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:47:28,793 epoch 6 - iter 770/1546 - loss 0.02253172 - time (sec): 62.92 - samples/sec: 978.67 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:47:41,067 epoch 6 - iter 924/1546 - loss 0.02187836 - time (sec): 75.20 - samples/sec: 981.71 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:47:54,246 epoch 6 - iter 1078/1546 - loss 0.02149000 - time (sec): 88.38 - samples/sec: 978.46 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:48:06,750 epoch 6 - iter 1232/1546 - loss 0.02129113 - time (sec): 100.88 - samples/sec: 979.18 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:48:18,878 epoch 6 - iter 1386/1546 - loss 0.02099597 - time (sec): 113.01 - samples/sec: 985.77 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:48:30,988 epoch 6 - iter 1540/1546 - loss 0.02096803 - time (sec): 125.12 - samples/sec: 989.09 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:48:31,454 ----------------------------------------------------------------------------------------------------
2023-10-17 11:48:31,454 EPOCH 6 done: loss 0.0209 - lr: 0.000022
2023-10-17 11:48:34,257 DEV : loss 0.12126855552196503 - f1-score (micro avg) 0.761
2023-10-17 11:48:34,286 ----------------------------------------------------------------------------------------------------
2023-10-17 11:48:46,167 epoch 7 - iter 154/1546 - loss 0.01505337 - time (sec): 11.88 - samples/sec: 990.92 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:48:58,441 epoch 7 - iter 308/1546 - loss 0.01119824 - time (sec): 24.15 - samples/sec: 993.97 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:49:11,017 epoch 7 - iter 462/1546 - loss 0.01337559 - time (sec): 36.73 - samples/sec: 999.33 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:49:22,992 epoch 7 - iter 616/1546 - loss 0.01318854 - time (sec): 48.70 - samples/sec: 996.83 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:49:35,184 epoch 7 - iter 770/1546 - loss 0.01447548 - time (sec): 60.90 - samples/sec: 990.38 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:49:47,892 epoch 7 - iter 924/1546 - loss 0.01396535 - time (sec): 73.60 - samples/sec: 989.94 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:50:00,666 epoch 7 - iter 1078/1546 - loss 0.01435574 - time (sec): 86.38 - samples/sec: 1009.08 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:50:12,719 epoch 7 - iter 1232/1546 - loss 0.01381270 - time (sec): 98.43 - samples/sec: 1007.45 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:50:25,202 epoch 7 - iter 1386/1546 - loss 0.01396888 - time (sec): 110.91 - samples/sec: 1002.69 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:50:37,532 epoch 7 - iter 1540/1546 - loss 0.01474867 - time (sec): 123.24 - samples/sec: 1002.50 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:50:38,024 ----------------------------------------------------------------------------------------------------
2023-10-17 11:50:38,024 EPOCH 7 done: loss 0.0149 - lr: 0.000017
2023-10-17 11:50:41,034 DEV : loss 0.12127351760864258 - f1-score (micro avg) 0.7844
2023-10-17 11:50:41,065 ----------------------------------------------------------------------------------------------------
2023-10-17 11:50:53,540 epoch 8 - iter 154/1546 - loss 0.01576538 - time (sec): 12.47 - samples/sec: 935.69 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:51:05,757 epoch 8 - iter 308/1546 - loss 0.01252126 - time (sec): 24.69 - samples/sec: 980.49 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:51:17,966 epoch 8 - iter 462/1546 - loss 0.01184613 - time (sec): 36.90 - samples/sec: 965.34 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:51:30,153 epoch 8 - iter 616/1546 - loss 0.01041226 - time (sec): 49.09 - samples/sec: 963.73 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:51:42,569 epoch 8 - iter 770/1546 - loss 0.01031697 - time (sec): 61.50 - samples/sec: 983.14 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:51:55,445 epoch 8 - iter 924/1546 - loss 0.00954217 - time (sec): 74.38 - samples/sec: 992.65 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:52:08,027 epoch 8 - iter 1078/1546 - loss 0.00955285 - time (sec): 86.96 - samples/sec: 991.57 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:52:20,217 epoch 8 - iter 1232/1546 - loss 0.00938038 - time (sec): 99.15 - samples/sec: 999.30 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:52:32,586 epoch 8 - iter 1386/1546 - loss 0.00972415 - time (sec): 111.52 - samples/sec: 997.28 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:52:45,367 epoch 8 - iter 1540/1546 - loss 0.00955060 - time (sec): 124.30 - samples/sec: 995.55 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:52:45,895 ----------------------------------------------------------------------------------------------------
2023-10-17 11:52:45,896 EPOCH 8 done: loss 0.0095 - lr: 0.000011
2023-10-17 11:52:49,157 DEV : loss 0.12450835853815079 - f1-score (micro avg) 0.7808
2023-10-17 11:52:49,195 ----------------------------------------------------------------------------------------------------
2023-10-17 11:53:02,160 epoch 9 - iter 154/1546 - loss 0.00381374 - time (sec): 12.96 - samples/sec: 971.71 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:53:14,095 epoch 9 - iter 308/1546 - loss 0.00328944 - time (sec): 24.90 - samples/sec: 1050.97 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:53:26,475 epoch 9 - iter 462/1546 - loss 0.00423502 - time (sec): 37.28 - samples/sec: 1019.71 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:53:38,868 epoch 9 - iter 616/1546 - loss 0.00420190 - time (sec): 49.67 - samples/sec: 1020.69 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:53:51,185 epoch 9 - iter 770/1546 - loss 0.00420859 - time (sec): 61.99 - samples/sec: 1016.16 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:54:03,968 epoch 9 - iter 924/1546 - loss 0.00435302 - time (sec): 74.77 - samples/sec: 997.71 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:54:16,961 epoch 9 - iter 1078/1546 - loss 0.00505362 - time (sec): 87.76 - samples/sec: 990.07 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:54:29,580 epoch 9 - iter 1232/1546 - loss 0.00501432 - time (sec): 100.38 - samples/sec: 992.08 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:54:42,452 epoch 9 - iter 1386/1546 - loss 0.00475313 - time (sec): 113.25 - samples/sec: 994.63 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:54:55,104 epoch 9 - iter 1540/1546 - loss 0.00511714 - time (sec): 125.91 - samples/sec: 983.02 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:54:55,576 ----------------------------------------------------------------------------------------------------
2023-10-17 11:54:55,576 EPOCH 9 done: loss 0.0051 - lr: 0.000006
2023-10-17 11:54:58,676 DEV : loss 0.13177739083766937 - f1-score (micro avg) 0.795
2023-10-17 11:54:58,705 saving best model
2023-10-17 11:55:00,160 ----------------------------------------------------------------------------------------------------
2023-10-17 11:55:12,239 epoch 10 - iter 154/1546 - loss 0.00092056 - time (sec): 12.07 - samples/sec: 1002.09 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:55:24,900 epoch 10 - iter 308/1546 - loss 0.00262761 - time (sec): 24.73 - samples/sec: 1042.52 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:55:37,158 epoch 10 - iter 462/1546 - loss 0.00275953 - time (sec): 36.99 - samples/sec: 1035.14 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:55:49,156 epoch 10 - iter 616/1546 - loss 0.00356907 - time (sec): 48.99 - samples/sec: 1023.71 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:56:01,184 epoch 10 - iter 770/1546 - loss 0.00435324 - time (sec): 61.02 - samples/sec: 1031.31 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:56:13,414 epoch 10 - iter 924/1546 - loss 0.00397741 - time (sec): 73.25 - samples/sec: 1034.88 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:56:26,148 epoch 10 - iter 1078/1546 - loss 0.00379041 - time (sec): 85.98 - samples/sec: 1025.04 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:56:38,575 epoch 10 - iter 1232/1546 - loss 0.00369567 - time (sec): 98.41 - samples/sec: 1005.37 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:56:51,026 epoch 10 - iter 1386/1546 - loss 0.00339406 - time (sec): 110.86 - samples/sec: 1009.64 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:57:03,401 epoch 10 - iter 1540/1546 - loss 0.00334146 - time (sec): 123.23 - samples/sec: 1004.56 - lr: 0.000000 - momentum: 0.000000
2023-10-17 11:57:03,853 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:03,854 EPOCH 10 done: loss 0.0033 - lr: 0.000000
2023-10-17 11:57:06,887 DEV : loss 0.1316857933998108 - f1-score (micro avg) 0.7951
2023-10-17 11:57:06,916 saving best model
2023-10-17 11:57:08,961 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:08,963 Loading model from best epoch ...
2023-10-17 11:57:11,262 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 11:57:20,172
Results:
- F-score (micro) 0.8057
- F-score (macro) 0.7042
- Accuracy 0.6984
By class:
              precision    recall  f1-score   support

         LOC     0.8320    0.8636    0.8475       946
    BUILDING     0.6571    0.6216    0.6389       185
      STREET     0.6102    0.6429    0.6261        56

   micro avg     0.7961    0.8155    0.8057      1187
   macro avg     0.6998    0.7094    0.7042      1187
weighted avg     0.7943    0.8155    0.8045      1187
2023-10-17 11:57:20,172 ----------------------------------------------------------------------------------------------------
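To use the resulting checkpoint for tagging, something along these lines should work (a minimal sketch; the path to best-model.pt depends on where the training run stored it, and the example sentence is purely illustrative):

from flair.data import Sentence
from flair.models import SequenceTagger

# Load the best checkpoint saved during training (path is an assumption).
tagger = SequenceTagger.load("best-model.pt")

sentence = Sentence("We walked from Trafalgar Square along Whitehall to the British Museum .")
tagger.predict(sentence)

# Print the predicted LOC / BUILDING / STREET spans with their confidence scores.
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, span.get_label("ner").score)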