Uploaded model
- Developed by: sudominoru
- License: apache-2.0
- Finetuned from model: llm-jp/llm-jp-3-13b
This model was trained 2x faster with Unsloth and Hugging Face's TRL library.
How to Use the LoRA Adapter
This repository contains a LoRA adapter for the base model llm-jp/llm-jp-3-13b. It can be used for efficient inference or as a starting point for further fine-tuning.
Installation
Ensure you have the required libraries installed:
pip install unsloth
pip install peft
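A 13B-parameter model is demanding even with a LoRA adapter, so it is worth confirming that a CUDA GPU is visible before loading it. This check is optional and not part of the original instructions:

import torch

# Sanity check: a 13B model needs a CUDA GPU with ample VRAM
# (roughly 8+ GB when loaded in 4-bit, considerably more in 16-bit).
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1e9:.1f} GB")
else:
    print("No CUDA GPU detected; running this model on CPU is impractical.")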
Loading the Adapter
from unsloth import FastLanguageModel
from peft import PeftModel
import torch
# Hugging Face Token
HF_TOKEN = ""
model_id = "llm-jp/llm-jp-3-13b"
adapter_id = "sudominoru/llm-jp-3-13b-it_lora"
# Load the base model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_id,
    trust_remote_code=True,
)

# Add the LoRA adapter
model = PeftModel.from_pretrained(model, adapter_id, token=HF_TOKEN)
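If GPU memory is tight, Unsloth can also load the base weights in 4-bit before the adapter is attached. The parameter values below are illustrative assumptions, not settings taken from this repository:

# Optional: load the 13B base model in 4-bit to reduce VRAM usage
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_id,
    max_seq_length=2048,     # assumed context length; adjust to your task
    dtype=None,              # let Unsloth choose (bfloat16 on recent GPUs)
    load_in_4bit=True,       # quantize base weights to 4-bit
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter_id, token=HF_TOKEN)

Loading in 4-bit trades a small amount of output quality for a large reduction in memory, which is often the only practical way to run a 13B model on a single consumer GPU.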
Running Inference
from tqdm import tqdm

# Prepare input data
datasets = [
    {"task_id": 1, "input": "Explain the importance of clean energy."},
    {"task_id": 2, "input": "Translate 'How are you?' to Japanese."},
]

# Switch the model into Unsloth's optimized inference mode
FastLanguageModel.for_inference(model)

results = []
for dt in tqdm(datasets):
    input_text = dt["input"]
    # Prompt format used during fine-tuning: "### 指示" (instruction) / "### 回答" (response)
    prompt = f"""### 指示\n{input_text}\n### 回答\n"""

    # Tokenize input
    inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

    # Generate output
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        use_cache=True,
        do_sample=False,
        repetition_penalty=1.2,
    )

    # Decode and keep only the text after the "### 回答" marker
    prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split('\n### 回答')[-1]
    results.append({"task_id": dt["task_id"], "input": input_text, "output": prediction})

# Print results
for result in results:
    print(f"Task ID: {result['task_id']}\nInput: {result['input']}\nOutput: {result['output']}\n")