2023-10-17 12:25:36,997 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,998 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 12:25:36,998 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,998 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-17 12:25:36,998 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,998 Train: 7142 sentences
2023-10-17 12:25:36,998 (train_with_dev=False, train_with_test=False)
2023-10-17 12:25:36,998 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,998 Training Params:
2023-10-17 12:25:36,998 - learning_rate: "5e-05"
2023-10-17 12:25:36,998 - mini_batch_size: "8"
2023-10-17 12:25:36,998 - max_epochs: "10"
2023-10-17 12:25:36,998 - shuffle: "True"
2023-10-17 12:25:36,998 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,998 Plugins:
2023-10-17 12:25:36,998 - TensorboardLogger
2023-10-17 12:25:36,998 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 12:25:36,999 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,999 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 12:25:36,999 - metric: "('micro avg', 'f1-score')"
2023-10-17 12:25:36,999 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,999 Computation:
2023-10-17 12:25:36,999 - compute on device: cuda:0
2023-10-17 12:25:36,999 - embedding storage: none
2023-10-17 12:25:36,999 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,999 Model training base path: "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 12:25:36,999 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,999 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,999 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 12:25:44,054 epoch 1 - iter 89/893 - loss 2.51270054 - time (sec): 7.05 - samples/sec: 3473.60 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:25:50,693 epoch 1 - iter 178/893 - loss 1.56797013 - time (sec): 13.69 - samples/sec: 3627.57 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:25:57,490 epoch 1 - iter 267/893 - loss 1.16082935 - time (sec): 20.49 - samples/sec: 3664.35 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:26:04,127 epoch 1 - iter 356/893 - loss 0.95358020 - time (sec): 27.13 - samples/sec: 3623.69 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:26:11,127 epoch 1 - iter 445/893 - loss 0.80942724 - time (sec): 34.13 - samples/sec: 3607.41 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:26:18,044 epoch 1 - iter 534/893 - loss 0.70411281 - time (sec): 41.04 - samples/sec: 3613.53 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:26:24,723 epoch 1 - iter 623/893 - loss 0.63046341 - time (sec): 47.72 - samples/sec: 3617.11 - lr: 0.000035 - momentum: 0.000000
2023-10-17 12:26:31,917 epoch 1 - iter 712/893 - loss 0.56581935 - time (sec): 54.92 - samples/sec: 3609.36 - lr: 0.000040 - momentum: 0.000000
2023-10-17 12:26:39,013 epoch 1 - iter 801/893 - loss 0.52137910 - time (sec): 62.01 - samples/sec: 3587.67 - lr: 0.000045 - momentum: 0.000000
2023-10-17 12:26:46,067 epoch 1 - iter 890/893 - loss 0.48321528 - time (sec): 69.07 - samples/sec: 3590.42 - lr: 0.000050 - momentum: 0.000000
2023-10-17 12:26:46,259 ----------------------------------------------------------------------------------------------------
2023-10-17 12:26:46,260 EPOCH 1 done: loss 0.4823 - lr: 0.000050
2023-10-17 12:26:49,326 DEV : loss 0.10747521370649338 - f1-score (micro avg) 0.7346
2023-10-17 12:26:49,342 saving best model
2023-10-17 12:26:49,698 ----------------------------------------------------------------------------------------------------
2023-10-17 12:26:55,961 epoch 2 - iter 89/893 - loss 0.13115492 - time (sec): 6.26 - samples/sec: 3759.94 - lr: 0.000049 - momentum: 0.000000
2023-10-17 12:27:02,936 epoch 2 - iter 178/893 - loss 0.12309626 - time (sec): 13.24 - samples/sec: 3673.41 - lr: 0.000049 - momentum: 0.000000
2023-10-17 12:27:10,293 epoch 2 - iter 267/893 - loss 0.11730129 - time (sec): 20.59 - samples/sec: 3585.41 - lr: 0.000048 - momentum: 0.000000
2023-10-17 12:27:17,302 epoch 2 - iter 356/893 - loss 0.11326307 - time (sec): 27.60 - samples/sec: 3571.89 - lr: 0.000048 - momentum: 0.000000
2023-10-17 12:27:24,123 epoch 2 - iter 445/893 - loss 0.11005394 - time (sec): 34.42 - samples/sec: 3578.68 - lr: 0.000047 - momentum: 0.000000
2023-10-17 12:27:31,020 epoch 2 - iter 534/893 - loss 0.11030052 - time (sec): 41.32 - samples/sec: 3591.15 - lr: 0.000047 - momentum: 0.000000
2023-10-17 12:27:38,052 epoch 2 - iter 623/893 - loss 0.11047052 - time (sec): 48.35 - samples/sec: 3563.48 - lr: 0.000046 - momentum: 0.000000
2023-10-17 12:27:44,943 epoch 2 - iter 712/893 - loss 0.10912533 - time (sec): 55.24 - samples/sec: 3573.03 - lr: 0.000046 - momentum: 0.000000
2023-10-17 12:27:52,179 epoch 2 - iter 801/893 - loss 0.10854466 - time (sec): 62.48 - samples/sec: 3599.44 - lr: 0.000045 - momentum: 0.000000
2023-10-17 12:27:59,024 epoch 2 - iter 890/893 - loss 0.10807744 - time (sec): 69.32 - samples/sec: 3576.74 - lr: 0.000044 - momentum: 0.000000
2023-10-17 12:27:59,280 ----------------------------------------------------------------------------------------------------
2023-10-17 12:27:59,280 EPOCH 2 done: loss 0.1081 - lr: 0.000044
2023-10-17 12:28:03,929 DEV : loss 0.10729347169399261 - f1-score (micro avg) 0.7712
2023-10-17 12:28:03,944 saving best model
2023-10-17 12:28:04,550 ----------------------------------------------------------------------------------------------------
2023-10-17 12:28:11,804 epoch 3 - iter 89/893 - loss 0.07385466 - time (sec): 7.25 - samples/sec: 3586.68 - lr: 0.000044 - momentum: 0.000000
2023-10-17 12:28:18,766 epoch 3 - iter 178/893 - loss 0.06943125 - time (sec): 14.21 - samples/sec: 3555.12 - lr: 0.000043 - momentum: 0.000000
2023-10-17 12:28:26,515 epoch 3 - iter 267/893 - loss 0.07106101 - time (sec): 21.96 - samples/sec: 3470.07 - lr: 0.000043 - momentum: 0.000000
2023-10-17 12:28:33,657 epoch 3 - iter 356/893 - loss 0.07024538 - time (sec): 29.11 - samples/sec: 3536.22 - lr: 0.000042 - momentum: 0.000000
2023-10-17 12:28:40,826 epoch 3 - iter 445/893 - loss 0.07099337 - time (sec): 36.27 - samples/sec: 3534.14 - lr: 0.000042 - momentum: 0.000000
2023-10-17 12:28:47,363 epoch 3 - iter 534/893 - loss 0.07156220 - time (sec): 42.81 - samples/sec: 3553.29 - lr: 0.000041 - momentum: 0.000000
2023-10-17 12:28:54,140 epoch 3 - iter 623/893 - loss 0.07248512 - time (sec): 49.59 - samples/sec: 3563.39 - lr: 0.000041 - momentum: 0.000000
2023-10-17 12:29:00,842 epoch 3 - iter 712/893 - loss 0.07186031 - time (sec): 56.29 - samples/sec: 3564.65 - lr: 0.000040 - momentum: 0.000000
2023-10-17 12:29:07,960 epoch 3 - iter 801/893 - loss 0.07047222 - time (sec): 63.41 - samples/sec: 3548.23 - lr: 0.000039 - momentum: 0.000000
2023-10-17 12:29:14,213 epoch 3 - iter 890/893 - loss 0.07129680 - time (sec): 69.66 - samples/sec: 3560.12 - lr: 0.000039 - momentum: 0.000000
2023-10-17 12:29:14,438 ----------------------------------------------------------------------------------------------------
2023-10-17 12:29:14,438 EPOCH 3 done: loss 0.0712 - lr: 0.000039
2023-10-17 12:29:18,643 DEV : loss 0.10302536189556122 - f1-score (micro avg) 0.8027
2023-10-17 12:29:18,660 saving best model
2023-10-17 12:29:19,131 ----------------------------------------------------------------------------------------------------
2023-10-17 12:29:26,034 epoch 4 - iter 89/893 - loss 0.04394687 - time (sec): 6.90 - samples/sec: 3629.74 - lr: 0.000038 - momentum: 0.000000
2023-10-17 12:29:32,980 epoch 4 - iter 178/893 - loss 0.04673629 - time (sec): 13.85 - samples/sec: 3600.47 - lr: 0.000038 - momentum: 0.000000
2023-10-17 12:29:40,460 epoch 4 - iter 267/893 - loss 0.04981548 - time (sec): 21.33 - samples/sec: 3539.04 - lr: 0.000037 - momentum: 0.000000
2023-10-17 12:29:47,170 epoch 4 - iter 356/893 - loss 0.05110971 - time (sec): 28.04 - samples/sec: 3559.54 - lr: 0.000037 - momentum: 0.000000
2023-10-17 12:29:54,331 epoch 4 - iter 445/893 - loss 0.05026700 - time (sec): 35.20 - samples/sec: 3546.87 - lr: 0.000036 - momentum: 0.000000
2023-10-17 12:30:01,406 epoch 4 - iter 534/893 - loss 0.05089379 - time (sec): 42.27 - samples/sec: 3562.51 - lr: 0.000036 - momentum: 0.000000
2023-10-17 12:30:08,585 epoch 4 - iter 623/893 - loss 0.04964456 - time (sec): 49.45 - samples/sec: 3552.29 - lr: 0.000035 - momentum: 0.000000
2023-10-17 12:30:15,076 epoch 4 - iter 712/893 - loss 0.04857108 - time (sec): 55.94 - samples/sec: 3550.83 - lr: 0.000034 - momentum: 0.000000
2023-10-17 12:30:21,863 epoch 4 - iter 801/893 - loss 0.04927161 - time (sec): 62.73 - samples/sec: 3550.83 - lr: 0.000034 - momentum: 0.000000
2023-10-17 12:30:28,909 epoch 4 - iter 890/893 - loss 0.04890995 - time (sec): 69.77 - samples/sec: 3556.09 - lr: 0.000033 - momentum: 0.000000
2023-10-17 12:30:29,111 ----------------------------------------------------------------------------------------------------
2023-10-17 12:30:29,112 EPOCH 4 done: loss 0.0489 - lr: 0.000033
2023-10-17 12:30:33,260 DEV : loss 0.1494728922843933 - f1-score (micro avg) 0.7642
2023-10-17 12:30:33,277 ----------------------------------------------------------------------------------------------------
2023-10-17 12:30:39,784 epoch 5 - iter 89/893 - loss 0.03866759 - time (sec): 6.51 - samples/sec: 3762.35 - lr: 0.000033 - momentum: 0.000000
2023-10-17 12:30:46,083 epoch 5 - iter 178/893 - loss 0.03849278 - time (sec): 12.81 - samples/sec: 3724.90 - lr: 0.000032 - momentum: 0.000000
2023-10-17 12:30:53,079 epoch 5 - iter 267/893 - loss 0.04490770 - time (sec): 19.80 - samples/sec: 3670.89 - lr: 0.000032 - momentum: 0.000000
2023-10-17 12:31:00,158 epoch 5 - iter 356/893 - loss 0.04415524 - time (sec): 26.88 - samples/sec: 3637.83 - lr: 0.000031 - momentum: 0.000000
2023-10-17 12:31:07,534 epoch 5 - iter 445/893 - loss 0.04553621 - time (sec): 34.26 - samples/sec: 3633.96 - lr: 0.000031 - momentum: 0.000000
2023-10-17 12:31:14,613 epoch 5 - iter 534/893 - loss 0.04328694 - time (sec): 41.34 - samples/sec: 3609.67 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:31:22,041 epoch 5 - iter 623/893 - loss 0.04161932 - time (sec): 48.76 - samples/sec: 3587.47 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:31:28,696 epoch 5 - iter 712/893 - loss 0.04050119 - time (sec): 55.42 - samples/sec: 3603.61 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:31:35,963 epoch 5 - iter 801/893 - loss 0.04014660 - time (sec): 62.69 - samples/sec: 3587.44 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:31:42,520 epoch 5 - iter 890/893 - loss 0.03909834 - time (sec): 69.24 - samples/sec: 3583.50 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:31:42,686 ----------------------------------------------------------------------------------------------------
2023-10-17 12:31:42,686 EPOCH 5 done: loss 0.0392 - lr: 0.000028
2023-10-17 12:31:47,303 DEV : loss 0.1533377468585968 - f1-score (micro avg) 0.81
2023-10-17 12:31:47,319 saving best model
2023-10-17 12:31:47,795 ----------------------------------------------------------------------------------------------------
2023-10-17 12:31:54,920 epoch 6 - iter 89/893 - loss 0.02756819 - time (sec): 7.12 - samples/sec: 3532.05 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:32:01,948 epoch 6 - iter 178/893 - loss 0.02664455 - time (sec): 14.15 - samples/sec: 3607.81 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:32:08,603 epoch 6 - iter 267/893 - loss 0.02870236 - time (sec): 20.80 - samples/sec: 3631.55 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:32:15,508 epoch 6 - iter 356/893 - loss 0.02957670 - time (sec): 27.71 - samples/sec: 3612.67 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:32:22,842 epoch 6 - iter 445/893 - loss 0.02928950 - time (sec): 35.04 - samples/sec: 3582.35 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:32:30,106 epoch 6 - iter 534/893 - loss 0.03036133 - time (sec): 42.31 - samples/sec: 3591.43 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:32:36,998 epoch 6 - iter 623/893 - loss 0.03000592 - time (sec): 49.20 - samples/sec: 3584.54 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:32:43,597 epoch 6 - iter 712/893 - loss 0.02940656 - time (sec): 55.80 - samples/sec: 3586.57 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:32:50,304 epoch 6 - iter 801/893 - loss 0.02915799 - time (sec): 62.50 - samples/sec: 3577.00 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:32:57,362 epoch 6 - iter 890/893 - loss 0.02938837 - time (sec): 69.56 - samples/sec: 3565.24 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:32:57,557 ----------------------------------------------------------------------------------------------------
2023-10-17 12:32:57,558 EPOCH 6 done: loss 0.0294 - lr: 0.000022
2023-10-17 12:33:02,212 DEV : loss 0.20796504616737366 - f1-score (micro avg) 0.8062
2023-10-17 12:33:02,229 ----------------------------------------------------------------------------------------------------
2023-10-17 12:33:08,958 epoch 7 - iter 89/893 - loss 0.01719836 - time (sec): 6.73 - samples/sec: 3465.65 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:33:16,469 epoch 7 - iter 178/893 - loss 0.01793337 - time (sec): 14.24 - samples/sec: 3498.55 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:33:23,220 epoch 7 - iter 267/893 - loss 0.01837316 - time (sec): 20.99 - samples/sec: 3516.43 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:33:30,566 epoch 7 - iter 356/893 - loss 0.01931535 - time (sec): 28.34 - samples/sec: 3552.60 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:33:37,346 epoch 7 - iter 445/893 - loss 0.01987539 - time (sec): 35.12 - samples/sec: 3592.41 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:33:43,813 epoch 7 - iter 534/893 - loss 0.02129779 - time (sec): 41.58 - samples/sec: 3572.74 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:33:50,667 epoch 7 - iter 623/893 - loss 0.02255359 - time (sec): 48.44 - samples/sec: 3551.20 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:33:57,376 epoch 7 - iter 712/893 - loss 0.02186109 - time (sec): 55.15 - samples/sec: 3557.59 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:34:04,807 epoch 7 - iter 801/893 - loss 0.02137142 - time (sec): 62.58 - samples/sec: 3550.20 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:34:11,865 epoch 7 - iter 890/893 - loss 0.02088132 - time (sec): 69.63 - samples/sec: 3563.55 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:34:12,047 ----------------------------------------------------------------------------------------------------
2023-10-17 12:34:12,047 EPOCH 7 done: loss 0.0208 - lr: 0.000017
2023-10-17 12:34:16,135 DEV : loss 0.20536133646965027 - f1-score (micro avg) 0.8148
2023-10-17 12:34:16,151 saving best model
2023-10-17 12:34:16,630 ----------------------------------------------------------------------------------------------------
2023-10-17 12:34:23,808 epoch 8 - iter 89/893 - loss 0.01542112 - time (sec): 7.18 - samples/sec: 3650.48 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:34:30,937 epoch 8 - iter 178/893 - loss 0.01339288 - time (sec): 14.30 - samples/sec: 3587.50 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:34:37,882 epoch 8 - iter 267/893 - loss 0.01315703 - time (sec): 21.25 - samples/sec: 3583.44 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:34:44,510 epoch 8 - iter 356/893 - loss 0.01462132 - time (sec): 27.88 - samples/sec: 3579.37 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:34:51,613 epoch 8 - iter 445/893 - loss 0.01468807 - time (sec): 34.98 - samples/sec: 3567.31 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:34:58,772 epoch 8 - iter 534/893 - loss 0.01541694 - time (sec): 42.14 - samples/sec: 3563.31 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:35:05,922 epoch 8 - iter 623/893 - loss 0.01464883 - time (sec): 49.29 - samples/sec: 3564.81 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:35:13,086 epoch 8 - iter 712/893 - loss 0.01436159 - time (sec): 56.45 - samples/sec: 3586.05 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:35:19,554 epoch 8 - iter 801/893 - loss 0.01472488 - time (sec): 62.92 - samples/sec: 3591.16 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:35:25,816 epoch 8 - iter 890/893 - loss 0.01463998 - time (sec): 69.18 - samples/sec: 3585.66 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:35:26,039 ----------------------------------------------------------------------------------------------------
2023-10-17 12:35:26,040 EPOCH 8 done: loss 0.0146 - lr: 0.000011
2023-10-17 12:35:30,693 DEV : loss 0.2075384557247162 - f1-score (micro avg) 0.8237
2023-10-17 12:35:30,710 saving best model
2023-10-17 12:35:31,198 ----------------------------------------------------------------------------------------------------
2023-10-17 12:35:38,366 epoch 9 - iter 89/893 - loss 0.01253730 - time (sec): 7.17 - samples/sec: 3569.12 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:35:45,712 epoch 9 - iter 178/893 - loss 0.01262600 - time (sec): 14.51 - samples/sec: 3506.66 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:35:52,657 epoch 9 - iter 267/893 - loss 0.01162607 - time (sec): 21.46 - samples/sec: 3567.08 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:35:59,395 epoch 9 - iter 356/893 - loss 0.01094062 - time (sec): 28.20 - samples/sec: 3564.61 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:36:06,132 epoch 9 - iter 445/893 - loss 0.01073676 - time (sec): 34.93 - samples/sec: 3590.56 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:36:12,723 epoch 9 - iter 534/893 - loss 0.01115983 - time (sec): 41.52 - samples/sec: 3613.46 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:36:19,548 epoch 9 - iter 623/893 - loss 0.01126760 - time (sec): 48.35 - samples/sec: 3607.99 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:36:26,185 epoch 9 - iter 712/893 - loss 0.01160709 - time (sec): 54.99 - samples/sec: 3597.68 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:36:33,180 epoch 9 - iter 801/893 - loss 0.01166356 - time (sec): 61.98 - samples/sec: 3599.54 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:36:40,480 epoch 9 - iter 890/893 - loss 0.01145458 - time (sec): 69.28 - samples/sec: 3581.17 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:36:40,683 ----------------------------------------------------------------------------------------------------
2023-10-17 12:36:40,683 EPOCH 9 done: loss 0.0115 - lr: 0.000006
2023-10-17 12:36:45,383 DEV : loss 0.22258678078651428 - f1-score (micro avg) 0.8193
2023-10-17 12:36:45,400 ----------------------------------------------------------------------------------------------------
2023-10-17 12:36:52,507 epoch 10 - iter 89/893 - loss 0.00776903 - time (sec): 7.11 - samples/sec: 3560.36 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:36:59,560 epoch 10 - iter 178/893 - loss 0.00915230 - time (sec): 14.16 - samples/sec: 3535.40 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:37:06,350 epoch 10 - iter 267/893 - loss 0.00784800 - time (sec): 20.95 - samples/sec: 3548.45 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:37:13,531 epoch 10 - iter 356/893 - loss 0.00767143 - time (sec): 28.13 - samples/sec: 3559.37 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:37:20,242 epoch 10 - iter 445/893 - loss 0.00765126 - time (sec): 34.84 - samples/sec: 3582.01 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:37:27,267 epoch 10 - iter 534/893 - loss 0.00790874 - time (sec): 41.87 - samples/sec: 3545.89 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:37:33,931 epoch 10 - iter 623/893 - loss 0.00716107 - time (sec): 48.53 - samples/sec: 3552.59 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:37:40,932 epoch 10 - iter 712/893 - loss 0.00685343 - time (sec): 55.53 - samples/sec: 3536.99 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:37:47,813 epoch 10 - iter 801/893 - loss 0.00683376 - time (sec): 62.41 - samples/sec: 3547.70 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:37:55,138 epoch 10 - iter 890/893 - loss 0.00691368 - time (sec): 69.74 - samples/sec: 3557.99 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:37:55,342 ----------------------------------------------------------------------------------------------------
2023-10-17 12:37:55,342 EPOCH 10 done: loss 0.0069 - lr: 0.000000
2023-10-17 12:37:59,528 DEV : loss 0.22205151617527008 - f1-score (micro avg) 0.8235
2023-10-17 12:37:59,904 ----------------------------------------------------------------------------------------------------
2023-10-17 12:37:59,906 Loading model from best epoch ...
2023-10-17 12:38:01,386 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 12:38:10,955
Results:
- F-score (micro) 0.703
- F-score (macro) 0.6203
- Accuracy 0.56
By class:
              precision    recall  f1-score   support

         LOC     0.7466    0.6995    0.7223      1095
         PER     0.7818    0.7648    0.7732      1012
         ORG     0.4431    0.5994    0.5095       357
   HumanProd     0.3922    0.6061    0.4762        33

   micro avg     0.6957    0.7105    0.7030      2497
   macro avg     0.5909    0.6675    0.6203      2497
weighted avg     0.7128    0.7105    0.7093      2497
2023-10-17 12:38:10,955 ----------------------------------------------------------------------------------------------------
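
The averaged F-scores in the "By class" table can be re-derived from the per-class rows. The following is a minimal sketch, not part of the logged run: it hard-codes the figures printed above, and the names `by_class` and `f1` are hypothetical helpers introduced only for illustration. It assumes the usual definitions: micro F1 as the harmonic mean of micro precision and recall, macro F1 as the unweighted mean of class F1 scores, and weighted F1 as the support-weighted mean.

```python
# Per-class rows copied from the test-set report above:
# label -> (precision, recall, f1, support)
by_class = {
    "LOC": (0.7466, 0.6995, 0.7223, 1095),
    "PER": (0.7818, 0.7648, 0.7732, 1012),
    "ORG": (0.4431, 0.5994, 0.5095, 357),
    "HumanProd": (0.3922, 0.6061, 0.4762, 33),
}

# Micro-averaged precision and recall, also copied from the report.
micro_precision, micro_recall = 0.6957, 0.7105


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_f1 = f1(micro_precision, micro_recall)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in by_class.values()) / len(by_class)

# Weighted F1: per-class F1 scores weighted by class support.
total_support = sum(row[3] for row in by_class.values())
weighted_f1 = sum(row[2] * row[3] for row in by_class.values()) / total_support

print(f"micro F1:    {micro_f1:.4f}")
print(f"macro F1:    {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
print(f"support:     {total_support}")
```

Because the inputs are already rounded to four decimals, the recomputed values can differ from the logged ones in the last digit.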