Model Card for dsfsi/mistral-7b-custom_prompt_long_short_31_gpu_days

The model is the result of fine-tuning Mistral-7B-v0.1 on a downstream task in a low-resource setting. It can translate English sentences into Zulu and Xhosa.

Model Details

Model Description

The model dsfsi/mistral-7b-custom_prompt_long_short_31_gpu_days was fine-tuned for 31 GPU days from the base model mistralai/Mistral-7B-v0.1. It was fine-tuned to improve the translation performance of large language models on low-resource, morphologically rich African languages using a custom prompt.

  • Developed by: Pitso Walter Khoboko, Vukosi Marivate and Joseph Sefara
  • Funded by: University of Pretoria and Data Science For Social Impact
  • Shared by: Pitso Walter Khoboko
  • Model type: Sequence-to-sequence model
  • Language(s) (NLP): English to Zulu and Xhosa
  • License: cc-by-4.0
  • Finetuned from model: mistralai/Mistral-7B-v0.1

Model Sources [optional]

Uses

The model can be used to translate English to Zulu and Xhosa. With further improvement, it can translate domain-specific information from English to Zulu and Xhosa; for example, it could make agricultural research written in English available to small-scale farmers who speak Zulu or Xhosa. It can also be used in education to teach core subjects in native South African languages, which can improve pupils' performance in those subjects.

Direct Use

You can download the model, dsfsi/mistral-7b-custom_prompt_long_short_31_gpu_days, and prompt it to translate English sentences to Zulu and Xhosa sentences.

Downstream Use [optional]

[More Information Needed]

Out-of-Scope Use

The model, dsfsi/mistral-7b-custom_prompt_long_short_31_gpu_days, will not work well for politically biased prompts, prompts that promote sexual bias, or requests to translate a whole English document to Zulu or Xhosa in a single prompt.

Bias, Risks, and Limitations

The dataset used to train the model still contained Zulu and Xhosa sentences with embedded English words; without further fine-tuning on a clean dataset, the model should not be used in an official capacity to translate English to Zulu or Xhosa.

Recommendations

If you want to use the model in an official capacity, we recommend further fine-tuning it on a clean dataset for more than 31 GPU days.
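
As a starting point, a continued fine-tuning run might look like the sketch below. LoRA, every hyperparameter, and the dataset file name are illustrative assumptions; the card does not document the authors' exact training setup.

```python
# A minimal sketch of further fine-tuning with LoRA via the peft library.
# LoRA, all hyperparameters, and "clean_parallel_corpus.jsonl" are
# illustrative assumptions, not the authors' documented setup.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "dsfsi/mistral-7b-custom_prompt_long_short_31_gpu_days"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# bfloat16 assumes an Ampere-or-newer GPU.
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
))

# Expects a JSONL file with a "text" column of formatted translation prompts.
data = load_dataset("json", data_files="clean_parallel_corpus.jsonl", split="train")
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                batched=True, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="mistral-en-zu-xh-continued",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        learning_rate=2e-4,
        bf16=True,
    ),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```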

How to Get Started with the Model

Use the code below to get started with the model.
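
Below is a minimal sketch that loads the model with the transformers library and prompts it to translate a single sentence. The prompt wording is an illustrative assumption, since the card does not specify the exact custom prompt format used during fine-tuning.

```python
# A minimal sketch: load the model and prompt it to translate one sentence.
# The prompt wording below is an assumption; the card does not document
# the custom prompt format used during fine-tuning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dsfsi/mistral-7b-custom_prompt_long_short_31_gpu_days"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = ("Translate the following English sentence to Zulu.\n"
          "English: Good morning, how are you?\n"
          "Zulu:")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)

# Decode only the newly generated tokens (the translation).
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```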

Training Details

Training Data

  • nwu-ctext/autshumato
  • Helsinki-NLP/opus-100
  • WMT22

The above datasets were collected individually and combined into a multilingual dataset containing English-to-Zulu and English-to-Xhosa sentence pairs, as sketched below.
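
As an illustration, the English-Zulu and English-Xhosa portions of opus-100 could be merged with the datasets library as follows; the authors' actual collection, filtering, and combination steps are not documented in this card.

```python
# A minimal sketch, assuming the opus-100 "en-zu" and "en-xh" configurations;
# the authors' exact preprocessing is not specified in the card.
from datasets import load_dataset, concatenate_datasets

en_zu = load_dataset("Helsinki-NLP/opus-100", "en-zu", split="train")
en_xh = load_dataset("Helsinki-NLP/opus-100", "en-xh", split="train")

# Flatten the {"translation": {"en": ..., "zu"/"xh": ...}} records into a
# common (source, target, language) schema before merging.
def flatten(example, tgt):
    pair = example["translation"]
    return {"source": pair["en"], "target": pair[tgt], "language": tgt}

en_zu = en_zu.map(lambda ex: flatten(ex, "zu"), remove_columns=["translation"])
en_xh = en_xh.map(lambda ex: flatten(ex, "xh"), remove_columns=["translation"])

# Shuffle so English-Zulu and English-Xhosa examples are interleaved.
multilingual = concatenate_datasets([en_zu, en_xh]).shuffle(seed=42)
print(multilingual[0])
```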

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

  • Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

  • bleu: used to check whether the model translates Zulu and Xhosa words properly when compared with the ground truth (see the scoring sketch after this list).
  • f1: evaluates larger linguistic units such as grammatical chunks and syntactic frames, making it more suitable for languages with complex syntactic structures.
  • G-Eva: uses embeddings to capture the contextual and semantic similarity between hypothesis and reference translations.
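
For BLEU, a scoring sketch using the evaluate library is shown below; sacreBLEU as the backend is an assumption, and the exact F1 and G-Eva setups are not specified in the card.

```python
# A minimal sketch of scoring translations with BLEU via the evaluate library.
# The example sentences are placeholders, not real model outputs.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Sawubona, unjani?"]             # model output (Zulu)
references = [["Sawubona, unjani namhlanje?"]]  # ground-truth translation

result = bleu.compute(predictions=predictions, references=references)
print(f"BLEU: {result['score']:.1f}")
```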

Results

  • Eng-Zul: BLEU 20, F1 42, G-Eva 92%
  • Eng-Xh: BLEU 14, F1 38, G-Eva 91%

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019); a measurement sketch follows the list below.

  • Hardware Type: [More Information Needed]
  • Hours used: [More Information Needed]
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]
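
Emissions can also be measured directly at training time; the sketch below uses the codecarbon package, which is a different tool from the calculator cited above.

```python
# A minimal sketch of measuring emissions during a run with codecarbon.
from codecarbon import EmissionsTracker

tracker = EmissionsTracker(project_name="mistral-7b-en-zu-xh")
tracker.start()
# ... run training or inference here ...
emissions_kg = tracker.stop()  # estimated CO2-equivalent in kilograms
print(f"Estimated emissions: {emissions_kg:.4f} kg CO2eq")
```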

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

@article{khoboko2025optimizing,
  title={Optimizing translation for low-resource languages: Efficient fine-tuning with custom prompt engineering in large language models},
  author={Khoboko, Pitso Walter and Marivate, Vukosi and Sefara, Joseph},
  journal={Machine Learning with Applications},
  volume={20},
  pages={100649},
  year={2025},
  publisher={Elsevier}
}

APA:

Khoboko, P. W., Marivate, V., & Sefara, J. (2025). Optimizing translation for low-resource languages: Efficient fine-tuning with custom prompt engineering in large language models. Machine Learning with Applications, 20, 100649.

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

Pitso Walter Khoboko

Model Card Contact

[email protected] (Pitso Walter Khoboko), [email protected] (Vukosi Marivate), [email protected] (Joseph Sefara)
