---
base_model: JetBrains/Mellum-4b-base
datasets:
- Etherll/CodeFIM-Rust-Mellum
tags:
- text-generation-inference
- transformers
- llama
- trl
- sft
- code
- rust
- fill-in-the-middle
- fim
- text-generation
- llm
license: apache-2.0
language:
- en
library_name: transformers
model-index:
- name: Etherll/Mellum-4b-sft-rust
  results: []
---

# Etherll/Mellum-4b-sft-rust

**Etherll/Mellum-4b-sft-rust** is a large language model (LLM) fine-tuned specifically for **Rust code Fill-in-the-Middle (FIM)** tasks. It is built on the `JetBrains/Mellum-4b-base` model.

This model was fine-tuned on the `Etherll/CodeFIM-Rust-Mellum` dataset, which comprises approximately 57,000 Rust-specific FIM examples, to sharpen its ability to complete Rust code snippets accurately and in context.

A GGUF version for CPU inference is also available: [Etherll/Mellum-4b-sft-rust-GGUF](https://huggingface.co/Etherll/Mellum-4b-sft-rust-GGUF).

## Model Description

This model leverages the LLaMA-style architecture of `Mellum-4b-base` (4 billion parameters) and its extensive pre-training on over 4 trillion tokens. The fine-tuning process focused on adapting the model to the nuances of Rust syntax and common coding patterns for FIM tasks.

**Key Features:**

* **Specialized for Rust:** Optimized for Fill-in-the-Middle tasks in Rust code.
* **Based on Mellum-4b-base:** Benefits from JetBrains' robust base model.
* **Efficient:** Suitable for both cloud and local deployment.
* **IDE Integration Ready:** Designed for use in developer tooling, and works particularly well with [Continue.dev](https://www.continue.dev/) for an enhanced coding-assistant experience.
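To make the IDE-integration point concrete: an editor plugin typically splits the open buffer at the cursor into a prefix (code before the cursor) and a suffix (code after it) before building a FIM request. A minimal sketch of that split — the `split_at_cursor` helper and the Rust snippet are purely illustrative, not part of this model's tooling:

```python
# Illustrative helper (an assumption, not part of any official API):
# split an editor buffer at the cursor position into (prefix, suffix)
# for a Fill-in-the-Middle request.
def split_at_cursor(buffer: str, cursor: int) -> tuple[str, str]:
    # Clamp the cursor so out-of-range positions still yield a valid split.
    cursor = max(0, min(cursor, len(buffer)))
    return buffer[:cursor], buffer[cursor:]

code = "fn main() {\n    \n}\n"
# Place the cursor on the indented blank line inside main().
prefix, suffix = split_at_cursor(code, code.index("\n    ") + 5)
print(repr(prefix))
print(repr(suffix))
```

The model is then asked to generate the middle that joins `prefix` to `suffix`.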

## Fine-tuning Data

* **Dataset:** `Etherll/CodeFIM-Rust-Mellum`
* **Size:** ~57,000 rows
* **Focus:** Rust code Fill-in-the-Middle

## FIM Format

This model is trained to recognize a specific format for Fill-in-the-Middle tasks. When providing input for FIM, use the following structure:

```
<filename>{{{file_name}}}
<fim_suffix>{{{suffix_code}}}<fim_prefix>{{{prefix_code}}}<fim_middle>
```
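As a concrete illustration, the template above can be filled in programmatically. This is a minimal sketch — the `build_fim_prompt` helper and the Rust snippet are illustrative, not an official API; note that the suffix is placed before the prefix, as in the template:

```python
# Illustrative helper (an assumption, not an official API): substitute the
# three placeholders of the FIM template shown above. The suffix comes
# before the prefix, matching the template's token order.
def build_fim_prompt(file_name: str, prefix_code: str, suffix_code: str) -> str:
    return (
        f"<filename>{file_name}\n"
        f"<fim_suffix>{suffix_code}<fim_prefix>{prefix_code}<fim_middle>"
    )

prompt = build_fim_prompt(
    "main.rs",
    prefix_code="fn add(a: i32, b: i32) -> i32 {\n    ",
    suffix_code="\n}",
)
print(prompt)
```

The model's completion then continues after `<fim_middle>`, filling in the function body between the given prefix and suffix.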

## How to Use

### With Continue.dev

For the best integrated development experience, it's highly recommended to use this model with [Continue.dev](https://www.continue.dev/).

Refer to the [Continue.dev documentation](https://www.continue.dev/docs/setup/overview) for instructions on how to add custom LLMs.

### GGUF Version

A GGUF version is available at [Etherll/Mellum-4b-sft-rust-GGUF](https://huggingface.co/Etherll/Mellum-4b-sft-rust-GGUF).
This format is suitable for local inference on CPU (and on GPU with appropriate llama.cpp/Ollama builds) using tools such as:

* [llama.cpp](https://github.com/ggerganov/llama.cpp)
* [Ollama](https://ollama.ai/)
* [LM Studio](https://lmstudio.ai/)

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)