@@ -1,9 +1,137 @@
 ---
-license: mit
 library_name: transformers
-pipeline_tag: text-generation
 ---
-This repository contains the model described in [LLM-Supported Natural Language to Bash Translation](https://arxiv.org/abs/2502.06858).
-Code is available at https://github.com/westenfelder/NL2SH

 ---
 library_name: transformers
+pipeline_tag: translation
+license: mit
+datasets:
+- westenfelder/NL2SH-ALFA
+language:
+- en
+base_model: Qwen/Qwen2.5-Coder-7B-Instruct
+model-index:
+- name: Qwen2.5-Coder-7B-Instruct-NL2SH
+  results:
+  - task:
+      type: translation
+      name: Natural Language to Bash Translation
+    dataset:
+      type: translation
+      name: NL2SH-ALFA
+      split: test
+    metrics:
+      - type: accuracy
+        value: 0.51
+        name: InterCode-ALFA
+    source:
+      name: InterCode-ALFA
+      url: https://arxiv.org/abs/2502.06858
 ---
+# Model Card for Qwen2.5-Coder-7B-Instruct-NL2SH
+This model translates natural language (English) instructions to Bash commands.
+## Model Details
+### Model Description
+This model is a fine-tuned version of the [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) model trained on the [NL2SH-ALFA](https://huggingface.co/datasets/westenfelder/NL2SH-ALFA) dataset for the task of natural language to Bash translation (NL2SH). For more information, please refer to the [paper](https://arxiv.org/abs/2502.06858).
+- **Developed by:** [Anyscale Learning For All (ALFA) Group at MIT-CSAIL](https://alfagroup.csail.mit.edu/)
+- **Language:** English
+- **License:** MIT License
+- **Finetuned from model:** [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)
+### Model Sources
+- **Repository:** [GitHub Repo](https://github.com/westenfelder/NL2SH)
+- **Paper:** [LLM-Supported Natural Language to Bash Translation](https://arxiv.org/abs/2502.06858)
+## Uses
+### Direct Use
+This model is intended for research on machine translation. The model can also be used as an educational resource for learning Bash.
+### Out-of-Scope Use
+This model should not be used in production or automated systems without human verification.
+**Considerations for use in high-risk environments:** This model should not be used in high-risk environments due to its low accuracy and potential for generating harmful commands.
+## Bias, Risks, and Limitations
+This model has a tendency to generate overly complex and incorrect Bash commands. It may produce harmful commands that delete data or corrupt a system. This model is not intended for natural languages other than English, scripting languages other than Bash, or multi-line Bash scripts.
+### Recommendations
+Users are encouraged to use this model as a Bash reference tool and should not execute generated commands without first verifying them.
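One way to keep a human in the loop is to flag suspicious output before anyone considers running it. The sketch below is illustrative only: `RISKY_PATTERNS`, `flag_risky`, and `review` are hypothetical names that are not part of this repository or the paper, and the deny-list is deliberately small and far from exhaustive. It never executes a command; it only builds a message for a reviewer.

```python
import re

# Hypothetical deny-list of patterns that often indicate destructive commands.
# This is an assumption for illustration, not a complete safety filter.
RISKY_PATTERNS = [
    r"\brm\b",            # file deletion
    r"\bdd\b",            # raw disk writes
    r"\bmkfs(\.\w+)?\b",  # filesystem creation
    r"\bshred\b",         # secure deletion
    r">\s*/dev/",         # redirecting output onto a device
]


def flag_risky(command: str) -> list[str]:
    """Return the deny-list patterns that match the generated command."""
    return [p for p in RISKY_PATTERNS if re.search(p, command)]


def review(command: str) -> str:
    """Build a message for a human reviewer; this never executes anything."""
    hits = flag_risky(command)
    if hits:
        return f"REVIEW CAREFULLY ({len(hits)} risky pattern(s)): {command}"
    return f"Review before running: {command}"


print(review("rm -rf /tmp/x"))
print(review("ls -la"))
```

A pattern match is only a hint. A command that passes this check can still be wrong or harmful, so manual verification remains necessary.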
+## How to Get Started with the Model
+Use the code below to get started with the model.
+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM
+def translate(prompt):
+    model_name = "westenfelder/Qwen2.5-Coder-7B-Instruct-NL2SH"
+    tokenizer = AutoTokenizer.from_pretrained(model_name, clean_up_tokenization_spaces=False)
+    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="cuda", torch_dtype=torch.bfloat16)
+    messages = [
+        {"role": "system", "content": "Your task is to translate a natural language instruction to a Bash command. You will receive an instruction in English and output a Bash command that can be run in a Linux terminal."},
+        {"role": "user", "content": prompt},
+    ]
+    tokens = tokenizer.apply_chat_template(
+        messages,
+        add_generation_prompt=True,
+        tokenize=True,
+        return_tensors="pt"
+    ).to(model.device)
+    attention_mask = torch.ones_like(tokens)
+    outputs = model.generate(
+        tokens,
+        attention_mask=attention_mask,
+        max_new_tokens=100,
+        do_sample=False,
+        temperature=None,
+        top_p=None,
+        top_k=None,
+    )
+    # Slice off the prompt tokens so that only the newly generated tokens are decoded.
+    response = outputs[0][tokens.shape[-1]:]
+    return tokenizer.decode(response, skip_special_tokens=True)
+nl = "List files in the /workspace directory that were accessed over an hour ago."
+sh = translate(nl)
+print(sh)
+```
+## Training Details
+### Training Data
+This model was trained on the [NL2SH-ALFA](https://huggingface.co/datasets/westenfelder/NL2SH-ALFA) dataset.
+### Training Procedure
+Please refer to Sections 4.1 and 4.3.4 of the [paper](https://arxiv.org/abs/2502.06858) for information about data pre-processing, training hyper-parameters, and hardware.
+## Evaluation
+This model was evaluated on the [NL2SH-ALFA](https://huggingface.co/datasets/westenfelder/NL2SH-ALFA) test set using the [InterCode-ALFA](https://github.com/westenfelder/InterCode-ALFA) benchmark.
+### Results
+This model achieved an accuracy of **0.51** on the InterCode-ALFA benchmark.
+## Environmental Impact
+Experiments were conducted using private infrastructure, which has an approximate carbon efficiency of 0.432 kgCO2eq/kWh. A cumulative total of 12 hours of computation was performed on hardware of type RTX A6000 (TDP of 300W). Total emissions are estimated to be 1.56 kgCO2eq, of which 0 percent was directly offset. Estimations were conducted using the [Machine Learning Emissions Calculator](https://mlco2.github.io/impact#compute).
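The estimate can be reproduced from the three figures stated above. This is a minimal sketch that assumes a single GPU drawing its full TDP for the whole run and computes energy times carbon intensity. The function name `estimate_emissions` is hypothetical, and the calculator itself may apply additional factors.

```python
# Values stated in this model card.
HOURS = 12                # cumulative compute time
TDP_WATTS = 300           # RTX A6000 thermal design power
CARBON_INTENSITY = 0.432  # kgCO2eq per kWh


def estimate_emissions(hours, tdp_watts, kg_per_kwh):
    """Estimate emissions as energy (kWh) times carbon intensity.

    Assumes the device draws its full TDP for the entire duration.
    """
    energy_kwh = hours * tdp_watts / 1000  # watt-hours -> kilowatt-hours
    return energy_kwh * kg_per_kwh


emissions = estimate_emissions(HOURS, TDP_WATTS, CARBON_INTENSITY)
print(f"{emissions:.2f} kgCO2eq")  # 3.6 kWh * 0.432 -> 1.56 kgCO2eq
```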
+## Citation
+**BibTeX:**
+```
+@misc{westenfelder2025llmsupportednaturallanguagebash,
+      title={LLM-Supported Natural Language to Bash Translation},
+      author={Finnian Westenfelder and Erik Hemberg and Miguel Tulla and Stephen Moskal and Una-May O'Reilly and Silviu Chiricescu},
+      year={2025},
+      eprint={2502.06858},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2502.06858},
+}
+```
+## Model Card Authors
+Finn Westenfelder
+## Model Card Contact
+Please email [email protected] or make a pull request.