|
---
library_name: transformers
pipeline_tag: translation
license: mit
datasets:
- westenfelder/NL2SH-ALFA
language:
- en
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
model-index:
- name: Qwen2.5-Coder-7B-Instruct-NL2SH
  results:
  - task:
      type: translation
      name: Natural Language to Bash Translation
    dataset:
      type: translation
      name: NL2SH-ALFA
      split: test
    metrics:
    - type: accuracy
      value: 0.51
      name: InterCode-ALFA
    source:
      name: InterCode-ALFA
      url: https://arxiv.org/abs/2502.06858
---
|
|
|
# Model Card for Qwen2.5-Coder-7B-Instruct-NL2SH |
|
This model translates natural language (English) instructions to Bash commands. |
|
|
|
## Model Details |
|
### Model Description |
|
This model is a fine-tuned version of the [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) model trained on the [NL2SH-ALFA](https://huggingface.co/datasets/westenfelder/NL2SH-ALFA) dataset for the task of natural language to Bash translation (NL2SH). For more information, please refer to the [paper](https://arxiv.org/abs/2502.06858). |
|
- **Developed by:** [Anyscale Learning For All (ALFA) Group at MIT-CSAIL](https://alfagroup.csail.mit.edu/) |
|
- **Language:** English |
|
- **License:** MIT License |
|
- **Finetuned from model:** [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) |
|
|
|
### Model Sources |
|
- **Repository:** [GitHub Repo](https://github.com/westenfelder/NL2SH) |
|
- **Paper:** [LLM-Supported Natural Language to Bash Translation](https://arxiv.org/abs/2502.06858) |
|
|
|
## Uses |
|
### Direct Use |
|
This model is intended for research on machine translation. The model can also be used as an educational resource for learning Bash. |
|
|
|
### Out-of-Scope Use |
|
This model should not be used in production or automated systems without human verification. |
|
|
|
**Considerations for use in high-risk environments:** This model should not be used in high-risk environments due to its low accuracy and potential for generating harmful commands. |
|
|
|
## Bias, Risks, and Limitations |
|
This model tends to generate overly complex and incorrect Bash commands. It may produce harmful commands that delete data or corrupt a system. It is not intended for natural languages other than English, for scripting languages other than Bash, or for multi-line Bash scripts.
|
|
|
### Recommendations |
|
Users are encouraged to treat this model as a Bash reference tool and should not execute generated commands without verification.
|
|
|
## How to Get Started with the Model |
|
Use the code below to get started with the model. |
|
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM


def translate(prompt):
    model_name = "westenfelder/Qwen2.5-Coder-7B-Instruct-NL2SH"
    tokenizer = AutoTokenizer.from_pretrained(model_name, clean_up_tokenization_spaces=False)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="cuda", torch_dtype=torch.bfloat16)

    messages = [
        {"role": "system", "content": "Your task is to translate a natural language instruction to a Bash command. You will receive an instruction in English and output a Bash command that can be run in a Linux terminal."},
        {"role": "user", "content": f"{prompt}"},
    ]

    # Build the chat-formatted input and move it to the model's device
    tokens = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_tensors="pt"
    ).to(model.device)

    attention_mask = torch.ones_like(tokens)

    # Greedy decoding (sampling disabled) for deterministic output
    outputs = model.generate(
        tokens,
        attention_mask=attention_mask,
        max_new_tokens=100,
        do_sample=False,
        temperature=None,
        top_p=None,
        top_k=None,
    )

    # Strip the prompt tokens and decode only the generated command
    response = outputs[0][tokens.shape[-1]:]
    return tokenizer.decode(response, skip_special_tokens=True)


nl = "List files in the /workspace directory that were accessed over an hour ago."
sh = translate(nl)
print(sh)
```
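
Generated commands should be reviewed before they are run (see Recommendations above). The sketch below is illustrative and not part of the model's tooling; it reuses the `translate` function from the previous snippet and executes a command only after explicit confirmation.

```python
import subprocess


def translate_and_confirm(prompt):
    # Reuses the translate() helper defined in the snippet above
    cmd = translate(prompt)
    print(f"Proposed command: {cmd}")
    # Keep a human in the loop: run only after explicit confirmation
    if input("Run this command? [y/N] ").strip().lower() == "y":
        subprocess.run(["bash", "-c", cmd], check=False)
    else:
        print("Command not executed.")


translate_and_confirm("Show disk usage for the current directory.")
```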
|
|
|
## Training Details |
|
### Training Data |
|
This model was trained on the [NL2SH-ALFA](https://huggingface.co/datasets/westenfelder/NL2SH-ALFA) dataset. |
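
To inspect the data, the dataset can be loaded directly from the Hugging Face Hub. The sketch below assumes only the dataset identifier above; check the dataset card for the actual splits and column names rather than relying on this example.

```python
from datasets import load_dataset

# Load NL2SH-ALFA from the Hugging Face Hub and list its splits
ds = load_dataset("westenfelder/NL2SH-ALFA")
print(ds)

# Inspect the schema and one example instead of assuming column names
first_split = next(iter(ds.values()))
print(first_split.features)
print(first_split[0])
```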
|
|
|
### Training Procedure |
|
Please refer to sections 4.1 and 4.3.4 of the [paper](https://arxiv.org/abs/2502.06858) for information about data pre-processing, training hyper-parameters, and hardware.
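
The paper is the authoritative reference for the training setup. Purely as an illustration of what supervised fine-tuning on this dataset can look like, here is a generic sketch using the `transformers` `Trainer`; the split name, column names (`nl`, `bash`), and every hyper-parameter below are assumptions, not the settings used to train this model.

```python
# Illustrative sketch only -- the authors' actual pre-processing, hyper-parameters,
# and hardware are documented in the paper. Column names ("nl", "bash") and all
# values below are placeholders.
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "Qwen/Qwen2.5-Coder-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16)


def format_example(example):
    # Render each instruction/command pair with the model's chat template
    messages = [
        {"role": "user", "content": example["nl"]},
        {"role": "assistant", "content": example["bash"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}


def tokenize_example(example):
    return tokenizer(example["text"], truncation=True, max_length=512)


raw = load_dataset("westenfelder/NL2SH-ALFA", split="train")
formatted = raw.map(format_example, remove_columns=raw.column_names)
tokenized = formatted.map(tokenize_example, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="qwen2.5-coder-nl2sh-sft",
        per_device_train_batch_size=1,
        num_train_epochs=1,
        learning_rate=2e-5,
        bf16=True,
    ),
    train_dataset=tokenized,
    # mlm=False gives standard causal-LM labels (inputs shifted by one)
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```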
|
|
|
## Evaluation |
|
This model was evaluated on the [NL2SH-ALFA](https://huggingface.co/datasets/westenfelder/NL2SH-ALFA) test set using the [InterCode-ALFA](https://github.com/westenfelder/InterCode-ALFA) benchmark. |
|
|
|
### Results |
|
This model achieved an accuracy of **0.51** on the InterCode-ALFA benchmark. |
|
|
|
## Environmental Impact |
|
Experiments were conducted using private infrastructure with an approximate carbon efficiency of 0.432 kgCO2eq/kWh. A cumulative 12 hours of computation was performed on RTX A6000 hardware (TDP of 300W). Total emissions are estimated at 1.56 kgCO2eq, of which 0 percent was directly offset. Estimates were made using the [Machine Learning Emissions Calculator](https://mlco2.github.io/impact#compute).
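
The estimate follows directly from the figures above: 12 hours at a 300W TDP is 3.6 kWh, and 3.6 kWh at 0.432 kgCO2eq/kWh gives roughly 1.56 kgCO2eq.

```python
# Back-of-the-envelope check of the emissions figure above
hours = 12                 # total compute time
tdp_kw = 300 / 1000        # RTX A6000 TDP in kW
kg_co2_per_kwh = 0.432     # carbon efficiency of the infrastructure

energy_kwh = hours * tdp_kw              # 3.6 kWh
emissions = energy_kwh * kg_co2_per_kwh  # ~1.56 kgCO2eq
print(f"{emissions:.2f} kgCO2eq")
```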
|
|
|
## Citation |
|
**BibTeX:** |
|
```
@misc{westenfelder2025llmsupportednaturallanguagebash,
      title={LLM-Supported Natural Language to Bash Translation},
      author={Finnian Westenfelder and Erik Hemberg and Miguel Tulla and Stephen Moskal and Una-May O'Reilly and Silviu Chiricescu},
      year={2025},
      eprint={2502.06858},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.06858},
}
```
|
|
|
## Model Card Authors |
|
Finn Westenfelder |
|
|
|
## Model Card Contact |
|
Please email [email protected] or make a pull request. |