Inclusively Classification Model

This model classifies inclusive language in Italian text. It is fine-tuned from the Italian BERT model.

It has been trained to detect three classes:

  • inclusive: the sentence is inclusive (e.g. "Il personale docente e non docente")
  • not_inclusive: the sentence is not inclusive (e.g. "I professori")
  • not_pertinent: the sentence is not pertinent to the task (e.g. "La scuola è chiusa")
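As a rough illustration of how a three-way classifier's raw scores map to these labels, the sketch below applies a softmax and picks the most probable class. The label order in `LABELS` is an assumption (the actual mapping is whatever the model's configuration defines), and the logits are made-up values, not real model output.

```python
import math

# ASSUMED label order; the real mapping is defined by the model's configuration.
LABELS = ["inclusive", "not_inclusive", "not_pertinent"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, labels=LABELS):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up logits for illustration only, not real model output.
label, score = decode([0.2, 2.9, -1.1])
print(label, round(score, 3))
```

The returned probability can be used as a confidence threshold, for example to flag only sentences that are classified as not_inclusive with high confidence.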

Training data

The model has been trained on a dataset containing:

  • 8580 training sentences
  • 1073 validation sentences
  • 1072 test sentences
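The three counts add up to 10,725 sentences, which corresponds to roughly an 80/10/10 split. The short check below recomputes the proportions from the figures listed above; it uses no information beyond those counts.

```python
# Split sizes as reported in the list above.
splits = {"train": 8580, "validation": 1073, "test": 1072}

total = sum(splits.values())
fractions = {name: count / total for name, count in splits.items()}

print(total)
for name, frac in fractions.items():
    print(f"{name}: {frac:.1%}")
```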

The dataset was manually annotated by experts in inclusive language. It is not yet publicly available.

Training procedure

The model has been fine-tuned from the Italian BERT model using the following hyperparameters:

  • max_length: 128
  • batch_size: 128
  • learning_rate: 5e-5
  • warmup_steps: 500
  • epochs: 10 (best model is selected based on validation accuracy)
  • optimizer: AdamW
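To put these values in context, the sketch below collects them in a small config object and estimates the number of optimizer steps. Several things here are assumptions, not statements from this card: a single device, no gradient accumulation, the last partial batch of each epoch being kept, and a linear warmup followed by linear decay (the scheduler type is not specified above). The names `FineTuneConfig`, `total_steps`, and `lr_at` are hypothetical helpers introduced for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    # Values taken from the hyperparameter list above.
    max_length: int = 128
    batch_size: int = 128
    learning_rate: float = 5e-5
    warmup_steps: int = 500
    epochs: int = 10
    train_sentences: int = 8580  # training split size reported in "Training data"


def total_steps(cfg):
    """Optimizer steps, ASSUMING one device, no gradient accumulation,
    and that the last partial batch of each epoch is kept."""
    steps_per_epoch = math.ceil(cfg.train_sentences / cfg.batch_size)
    return steps_per_epoch * cfg.epochs


def lr_at(step, cfg):
    """Learning rate at a given step under an ASSUMED schedule:
    linear warmup to the peak, then linear decay to zero."""
    total = total_steps(cfg)
    if step < cfg.warmup_steps:
        return cfg.learning_rate * step / cfg.warmup_steps
    remaining = max(total - step, 0)
    return cfg.learning_rate * remaining / max(total - cfg.warmup_steps, 1)


cfg = FineTuneConfig()
print(total_steps(cfg))
print(lr_at(250, cfg), lr_at(500, cfg), lr_at(680, cfg))
```

Under these assumptions the run lasts 680 steps, so the 500 warmup steps would cover most of training. If the actual setup used several devices or gradient accumulation, the step count would be lower still.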

Evaluation results

The model has been evaluated on the test set and obtained the following results:

| Model | Accuracy | Inclusive F1 | Not inclusive F1 | Not pertinent F1 |
|---|---|---|---|---|
| TF-IDF + MLP | 0.68 | 0.63 | 0.69 | 0.66 |
| TF-IDF + SVM | 0.61 | 0.53 | 0.60 | 0.78 |
| TF-IDF + GB | 0.74 | 0.74 | 0.76 | 0.72 |
| Multilingual | 0.86 | 0.88 | 0.89 | 0.83 |
| This model | 0.89 | 0.88 | 0.92 | 0.85 |

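Compared with a multilingual model trained on the same data, this model achieves higher accuracy (0.89 vs. 0.86) and higher F1 on the not inclusive (0.92 vs. 0.89) and not pertinent (0.85 vs. 0.83) classes, and the same F1 on the inclusive class (0.88).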
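For readers who want to compute the same kind of metrics on their own predictions, here is a minimal sketch of accuracy and per-class F1 using their standard definitions. The card does not state which tooling produced the reported numbers, so this is not necessarily the authors' evaluation code, and the labels below are toy values, not test-set data.

```python
def accuracy(y_true, y_pred):
    """Fraction of sentences whose predicted label matches the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_class(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives, so F1 is reported as zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only; these are not taken from the test set.
gold = ["inclusive", "not_inclusive", "not_pertinent",
        "inclusive", "not_inclusive", "not_pertinent"]
pred = ["inclusive", "not_inclusive", "not_pertinent",
        "not_inclusive", "not_inclusive", "inclusive"]

print(accuracy(gold, pred))
for label in ("inclusive", "not_inclusive", "not_pertinent"):
    print(label, round(f1_for_class(gold, pred, label), 2))
```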

Citation

If you use this model, please make sure to cite the following papers:

Main paper:

@article{10.1145/3729237,
author = {Greco, Salvatore and La Quatra, Moreno and Cagliero, Luca and Cerquitelli, Tania},
title = {Towards AI-Assisted Inclusive Language Writing in Italian Formal Communications},
year = {2025},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
issn = {2157-6904},
url = {https://doi.org/10.1145/3729237},
doi = {10.1145/3729237},
note = {Just Accepted},
journal = {ACM Trans. Intell. Syst. Technol.},
month = apr,
}

Demo paper:

@InProceedings{PKDD23_inclusively,
author="La Quatra, Moreno
and Greco, Salvatore
and Cagliero, Luca
and Cerquitelli, Tania",
title="Inclusively: An AI-Based Assistant for Inclusive Writing",
booktitle="Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track",
year="2023",
publisher="Springer Nature Switzerland",
address="Cham",
pages="361--365",
isbn="978-3-031-43430-3",
doi="10.1007/978-3-031-43430-3_31"
}