---
base_model: llm-jp/llm-jp-3-13b
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
---
# How to Run this Model
Basically, you can just load it as a Hugging Face model.
**Place elyza-tasks-100-TV_0.jsonl in the same folder beforehand.**
**Don't forget to replace HF_TOKEN with your own token.**

Environment setup
```
!pip install -U bitsandbytes
!pip install -U transformers
!pip install -U accelerate
!pip install -U datasets
```
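If you prefer not to hard-code HF_TOKEN in the script below, one option (a minimal sketch, not part of the original instructions) is to read the token from an environment variable and log in once with huggingface_hub:

```
# Optional: supply the token without hard-coding it in the script.
# Assumes the standard huggingface_hub API; adjust to your environment.
import os
from huggingface_hub import login

HF_TOKEN = os.environ.get("HF_TOKEN")  # e.g. set HF_TOKEN in your shell or Colab secrets
if HF_TOKEN:
    login(token=HF_TOKEN)  # caches the credential so from_pretrained() can find it
```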
Example code for producing the results JSONL file.
The inference results are written to llm-jp-3-13b-it-outputs.jsonl.
```
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
)
import torch
from tqdm import tqdm
import json

HF_TOKEN = "YOUR_HF_TOKEN"  # replace with your own Hugging Face token
model_name = "AlHfac/llm-jp-3-13b-it"

# QLoRA config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)

# Load model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    token=HF_TOKEN,
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True, token=HF_TOKEN)

# Load questions (each task is a JSON object; objects may span multiple lines)
datasets = []
with open("./elyza-tasks-100-TV_0.jsonl", "r") as f:
    item = ""
    for line in f:
        line = line.strip()
        item += line
        if item.endswith("}"):
            datasets.append(json.loads(item))
            item = ""

# Generate results using the loaded model
results = []
for data in tqdm(datasets):
    input = data["input"]
    prompt = f"""### 指示
{input}
### 回答:
"""
    tokenized_input = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            tokenized_input,
            max_new_tokens=100,
            do_sample=False,
            repetition_penalty=1.2,
        )[0]
    output = tokenizer.decode(outputs[tokenized_input.size(1):], skip_special_tokens=True)
    results.append({"task_id": data["task_id"], "input": input, "output": output})

# Write the results as JSONL
import re
model_name = re.sub(".*/", "", model_name)
with open(f"./{model_name}-outputs.jsonl", "w", encoding="utf-8") as f:
    for result in results:
        json.dump(result, f, ensure_ascii=False)  # ensure_ascii=False keeps non-ASCII characters readable
        f.write("\n")
```
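As a quick sanity check (not part of the original script), you can read the generated file back and inspect the first record; each line should be a JSON object with `task_id`, `input`, and `output` fields:

```
import json

# Quick check of the generated file (assumes the script above has already run)
with open("./llm-jp-3-13b-it-outputs.jsonl", "r", encoding="utf-8") as f:
    first = json.loads(f.readline())
print(first["task_id"], first["output"][:80])  # task id and the start of the first answer
```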
You may see a log message like the following; it can safely be ignored:

`The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results. Setting pad_token_id to eos_token_id:2 for open-end generation.`
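If you want to silence it anyway, a minimal sketch (assuming the same `tokenizer`, `model`, and `prompt` as above) is to pass the attention mask and a pad token id explicitly to `generate()`:

```
# Optional: pass attention_mask and pad_token_id explicitly to avoid the warning.
inputs = tokenizer(prompt, add_special_tokens=False, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        pad_token_id=tokenizer.eos_token_id,
        max_new_tokens=100,
        do_sample=False,
        repetition_penalty=1.2,
    )[0]
```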
# Model Training Information
- **Developed by:** AlHfac
- **License:** apache-2.0
- **Finetuned from model:** llm-jp/llm-jp-3-13b
This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)