|
--- |
|
datasets: |
|
- csebuetnlp/xlsum |
|
language: |
|
- am |
|
- ar |
|
- az |
|
- bn |
|
- my |
|
- zh |
|
- en |
|
- fr |
|
- gu |
|
- ha |
|
- hi |
|
- ig |
|
- id |
|
- ja |
|
- rn |
|
- ko |
|
- ky |
|
- mr |
|
- ne |
|
- om |
|
- ps |
|
- fa |
|
- pcm |
|
- pt |
|
- pa |
|
- ru |
|
- gd |
|
- sr |
|
- si |
|
- so |
|
- es |
|
- sw |
|
- ta |
|
- te |
|
- th |
|
- ti |
|
- tr |
|
- uk |
|
- ur |
|
- uz |
|
- vi |
|
- cy |
|
- yo |
|
multilinguality: |
|
- multilingual |
|
pipeline_tag: summarization |
|
--- |
|
# Model Card for Model ID |
|
|
|
|
|
|
This model is a fine-tuned version of [DeltaLM-base](https://huggingface.co/nguyenvulebinh/deltalm-base) on the [XLSum dataset](https://huggingface.co/datasets/csebuetnlp/xlsum) for multilingual abstractive summarization.
|
|
|
It achieves the following results on the evaluation set: |
|
- rouge-1: 18.2 |
|
- rouge-2: 7.6 |
|
- rouge-l: 14.9 |
|
- rouge-lsum: 14.7 |
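
For readers unfamiliar with the metrics above, the following is a minimal sketch of how ROUGE-N and ROUGE-L F1 scores can be computed for a single prediction/reference pair. It is not the evaluation script used for this model: the function names (`rouge_n`, `rouge_l`) are hypothetical, tokenization is plain lowercased whitespace splitting with no stemming, and scores are returned on a 0 to 1 scale. Whitespace splitting is also a poor fit for languages written without spaces between words, such as Chinese, Japanese, or Thai.

```python
from collections import Counter


def _f1(overlap: int, pred_len: int, ref_len: int) -> float:
    """Harmonic mean of precision and recall from an overlap count."""
    if overlap == 0 or pred_len == 0 or ref_len == 0:
        return 0.0
    precision = overlap / pred_len
    recall = overlap / ref_len
    return 2 * precision * recall / (precision + recall)


def _ngrams(tokens: list, n: int) -> Counter:
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(prediction: str, reference: str, n: int = 1) -> float:
    """ROUGE-N F1 using clipped n-gram overlap (assumed whitespace tokenizer)."""
    pred = _ngrams(prediction.lower().split(), n)
    ref = _ngrams(reference.lower().split(), n)
    overlap = sum((pred & ref).values())  # multiset intersection clips counts
    return _f1(overlap, sum(pred.values()), sum(ref.values()))


def rouge_l(prediction: str, reference: str) -> float:
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    # Standard dynamic-programming table for LCS length.
    table = [[0] * (len(ref) + 1) for _ in range(len(pred) + 1)]
    for i, p_tok in enumerate(pred, start=1):
        for j, r_tok in enumerate(ref, start=1):
            if p_tok == r_tok:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return _f1(table[len(pred)][len(ref)], len(pred), len(ref))


if __name__ == "__main__":
    hyp = "the cat sat on the mat"
    gold = "the cat is on the mat"
    print(f"rouge-1: {rouge_n(hyp, gold, 1):.3f}")
    print(f"rouge-2: {rouge_n(hyp, gold, 2):.3f}")
    print(f"rouge-l: {rouge_l(hyp, gold):.3f}")
```

Corpus-level scores are typically the average of per-example scores; multiplying by 100 gives percentage-style values.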
|
|
|
## Dataset description
|
The [XLSum dataset](https://huggingface.co/datasets/csebuetnlp/xlsum) is a comprehensive and diverse dataset comprising 1.35 million professionally annotated article-summary pairs from BBC, extracted using a set of carefully designed heuristics. The dataset covers 45 languages ranging from low- to high-resource, for many of which no public dataset is currently available. XLSum is highly abstractive, concise, and of high quality, as indicated by human and intrinsic evaluation.
|
|
|
## Languages |
|
- amharic |
|
- arabic |
|
- azerbaijani |
|
- bengali |
|
- burmese |
|
- chinese_simplified |
|
- chinese_traditional |
|
- english |
|
- french |
|
- gujarati |
|
- hausa |
|
- hindi |
|
- igbo |
|
- indonesian |
|
- japanese |
|
- kirundi |
|
- korean |
|
- kyrgyz |
|
- marathi |
|
- nepali |
|
- oromo |
|
- pashto |
|
- persian |
|
- pidgin |
|
- portuguese |
|
- punjabi |
|
- russian |
|
- scottish_gaelic |
|
- serbian_cyrillic |
|
- serbian_latin |
|
- sinhala |
|
- somali |
|
- spanish |
|
- swahili |
|
- tamil |
|
- telugu |
|
- thai |
|
- tigrinya |
|
- turkish |
|
- ukrainian |
|
- urdu |
|
- uzbek |
|
- vietnamese |
|
- welsh |
|
- yoruba |
|
|
|
## Training hyperparameters |
|
|
|
The model was trained on a p4d.24xlarge instance on AWS SageMaker with the following configuration:
|
- model: deltalm base |
|
- batch size: 8 |
|
- learning rate: 1e-5 |
|
- number of epochs: 3 |
|
- warmup steps: 500 |
|
- weight decay: 0.01 |
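
The configuration above can be captured in code roughly as follows. This is a minimal sketch, not the training script used for this model: the `FineTuneConfig` class and `warmup_lr` function are hypothetical names, the card does not say whether the batch size is per device or global, and the linear warmup shape (with a constant rate afterwards) is an assumption, since the card gives only the number of warmup steps.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as listed in the model card."""
    model: str = "deltalm base"
    batch_size: int = 8          # per-device vs. global is not stated in the card
    learning_rate: float = 1e-5
    num_epochs: int = 3
    warmup_steps: int = 500
    weight_decay: float = 0.01


def warmup_lr(step: int, cfg: FineTuneConfig) -> float:
    """Learning rate at a given optimizer step.

    Assumption: the rate rises linearly from 0 to cfg.learning_rate over
    cfg.warmup_steps, then stays constant. The schedule actually used
    after warmup is not documented in the card.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    if cfg.warmup_steps > 0 and step < cfg.warmup_steps:
        return cfg.learning_rate * step / cfg.warmup_steps
    return cfg.learning_rate


if __name__ == "__main__":
    cfg = FineTuneConfig()
    print(asdict(cfg))
    for step in (0, 250, 500, 1000):
        print(step, warmup_lr(step, cfg))
```

Keeping the values in a frozen dataclass makes the run settings easy to log alongside checkpoints and hard to mutate by accident.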