Model Card for Qwen2.5-Coder-7B-Instruct-NL2SH
This model translates natural language (English) instructions to Bash commands.
Model Details
Model Description
This model is a fine-tuned version of the Qwen2.5-Coder-7B-Instruct model trained on the NL2SH-ALFA dataset for the task of natural language to Bash translation (NL2SH). For more information, please refer to the paper.
- Developed by: Anyscale Learning For All (ALFA) Group at MIT-CSAIL
- Language: English
- License: MIT License
- Finetuned from model: Qwen/Qwen2.5-Coder-7B-Instruct
Model Sources
- Repository: GitHub Repo
- Paper: LLM-Supported Natural Language to Bash Translation
Uses
Direct Use
This model is intended for research on machine translation. The model can also be used as an educational resource for learning Bash.
Out-of-Scope Use
This model should not be used in production or automated systems without human verification. In particular, it should not be deployed in high-risk environments, given its low accuracy and its potential to generate harmful commands.
Bias, Risks, and Limitations
This model has a tendency to generate overly complex and incorrect Bash commands. It may produce harmful commands that delete data or corrupt a system. This model is not intended for natural languages other than English, scripting languages other than Bash, or multi-line Bash scripts.
Recommendations
Users are encouraged to use this model as a Bash reference tool and should not execute generated commands without first verifying them.
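As one illustration of pre-execution review, the hypothetical helper below passes a generated command through `bash -n`, which parses the command without executing it. This catches malformed syntax only; it says nothing about whether a command is safe or correct, so human review is still required.

```python
# Hypothetical safeguard: syntax-check a generated command without running it.
# "bash -n" only parses; it does NOT establish safety or correctness.
import subprocess

def syntax_ok(command: str) -> bool:
    result = subprocess.run(["bash", "-n", "-c", command], capture_output=True)
    return result.returncode == 0
```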
How to Get Started with the Model
Use the code below to get started with the model.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

def translate(prompt):
    model_name = "westenfelder/Qwen2.5-Coder-7B-Instruct-NL2SH"
    tokenizer = AutoTokenizer.from_pretrained(model_name, clean_up_tokenization_spaces=False)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="cuda", torch_dtype=torch.bfloat16)

    messages = [
        {"role": "system", "content": "Your task is to translate a natural language instruction to a Bash command. You will receive an instruction in English and output a Bash command that can be run in a Linux terminal."},
        {"role": "user", "content": f"{prompt}"},
    ]

    # Apply the chat template and move the input to the model's device.
    tokens = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_tensors="pt"
    ).to(model.device)

    attention_mask = torch.ones_like(tokens)

    # Greedy decoding (do_sample=False) for deterministic translations.
    outputs = model.generate(
        tokens,
        attention_mask=attention_mask,
        max_new_tokens=100,
        do_sample=False,
        temperature=None,
        top_p=None,
        top_k=None,
    )

    # Decode only the newly generated tokens, skipping the prompt.
    response = outputs[0][tokens.shape[-1]:]
    return tokenizer.decode(response, skip_special_tokens=True)

nl = "List files in the /workspace directory that were accessed over an hour ago."
sh = translate(nl)
print(sh)
```
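For the example above, a correct translation would be something along the lines of `find /workspace -type f -amin +60`. The exact command the model emits may differ, since many Bash commands are functionally equivalent; this expected output is given for illustration only.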
Training Details
Training Data
This model was trained on the NL2SH-ALFA dataset.
Training Procedure
Please refer to sections 4.1 and 4.3.4 of the paper for information about data pre-processing, training hyper-parameters, and hardware.
Evaluation
This model was evaluated on the NL2SH-ALFA test set using the InterCode-ALFA benchmark.
Results
This model achieved an accuracy of 0.51 on the InterCode-ALFA benchmark.
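For a quick local sanity check, a naive exact-match comparison against reference commands can be sketched as below. This is not the InterCode-ALFA protocol, which judges functional equivalence by executing commands, so exact string matching will understate accuracy. The dataset identifier and column names (`nl`, `bash`) are assumptions here; consult the dataset card for the actual schema.

```python
# Hypothetical sketch: naive exact-match scoring against reference commands.
# This is NOT the InterCode-ALFA benchmark, which executes commands to test
# functional equivalence; string matching will undercount correct outputs.
from datasets import load_dataset

# Assumed dataset ID and column names ("nl", "bash") -- verify on the dataset card.
ds = load_dataset("westenfelder/NL2SH-ALFA", split="test")
correct = sum(
    translate(ex["nl"]).strip() == ex["bash"].strip()  # translate() from the quickstart
    for ex in ds
)
print(f"Exact-match accuracy: {correct / len(ds):.3f}")
```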
Environmental Impact
Experiments were conducted using private infrastructure with an approximate carbon efficiency of 0.432 kgCO2eq/kWh. A cumulative 12 hours of computation was performed on RTX A6000 hardware (TDP of 300W). Total emissions are estimated to be 1.56 kgCO2eq, of which 0 percent was directly offset. Estimates were produced using the Machine Learning Emissions Calculator.
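As a back-of-the-envelope check, the 1.56 kgCO2eq figure follows from the numbers above, assuming the GPU ran at its full 300W TDP for all 12 hours:

```python
# Emissions estimate: power draw x runtime x grid carbon intensity.
tdp_kw = 0.300            # RTX A6000 TDP in kilowatts (assumes full utilization)
hours = 12                # cumulative computation time
carbon_intensity = 0.432  # kgCO2eq per kWh

energy_kwh = tdp_kw * hours                 # 3.6 kWh
emissions = energy_kwh * carbon_intensity   # ~1.56 kgCO2eq
print(f"{emissions:.2f} kgCO2eq")
```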
Citation
BibTeX:
```bibtex
@misc{westenfelder2025llmsupportednaturallanguagebash,
  title={LLM-Supported Natural Language to Bash Translation},
  author={Finnian Westenfelder and Erik Hemberg and Miguel Tulla and Stephen Moskal and Una-May O'Reilly and Silviu Chiricescu},
  year={2025},
  eprint={2502.06858},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2502.06858},
}
```
Model Card Authors
Finn Westenfelder
Model Card Contact
Please email [email protected] or make a pull request.