Commit · 7eb80ca
Parent(s): 8a502c1

Update README.md
README.md CHANGED
@@ -56,7 +56,8 @@ The BERT encoder is based on the following configuration:
 ## Training
 This model was trained on a personal fork of [NeMo](http://github.com/NVIDIA/NeMo), specifically this [sbd](https://github.com/1-800-BAD-CODE/NeMo/tree/sbd) branch.
 
-Model was trained on an A100 for
+Model was trained on an A100 for \~150k steps with a batch size of 256, with a $3 budget on the [Lambda cloud](https://cloud.lambdalabs.com/).
+Model was allowed to converge with 25M training sentences (1M per language).
 
 ### Training Data
 This model was trained on `OpenSubtitles`.
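For scale, the hyperparameters added in this commit imply roughly one and a half passes over the training set. A quick back-of-the-envelope check, under the assumption (not stated explicitly in the README) that each batch element is one sentence:

```python
# Sanity check of the training scale quoted in the diff above.
# Assumption: one sentence per batch element.
steps = 150_000               # ~150k optimizer steps
batch_size = 256              # sentences per step
train_sentences = 25_000_000  # 25M sentences total (1M per language)

sentences_seen = steps * batch_size        # 38,400,000 examples processed
epochs = sentences_seen / train_sentences  # ~1.54 passes over the corpus
print(f"{sentences_seen:,} examples ~= {epochs:.2f} epochs")
```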