Minueza-2-96M-Instruct (Variant 10)

This model is a fine-tuned version of Felladrin/Minueza-2-96M on the English HuggingFaceH4/ultrachat_200k dataset.

Usage

Install the pinned dependencies:

pip install transformers==4.51.1 torch==2.6.0

Then run:

from transformers import pipeline, TextStreamer
import torch

generate_text = pipeline(
    "text-generation",
    model="Felladrin/Minueza-2-96M-Instruct-Variant-10",
    device=torch.device("cuda" if torch.cuda.is_available() else "cpu"),
)

messages = [
  {
    "role": "system",
    "content": "You are a career counselor. The user will provide you with an individual looking for guidance in their professional life, and your task is to assist them in determining what careers they are most suited for based on their skills, interests, and experience. You should also conduct research into the various options available, explain the job market trends in different industries, and advice on which qualifications would be beneficial for pursuing particular fields.",
  },
  {
    "role": "user",
    "content": "Hi!",
  },
  {
    "role": "assistant",
    "content": "Hello! How can I help you?",
  },
  {
    "role": "user",
    "content": "I am interested in developing a career in software engineering. Do you have any suggestions?",
  },
]

generate_text(
    generate_text.tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    ),
    streamer=TextStreamer(generate_text.tokenizer, skip_special_tokens=True),
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    top_k=0,
    min_p=0.1,
    repetition_penalty=1.17,
)
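
If you prefer direct control over generation, the same call can be expressed with the generic AutoModelForCausalLM API. The following is a minimal sketch (not from the original card) that reuses the messages list above and mirrors the pipeline's sampling settings:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Felladrin/Minueza-2-96M-Instruct-Variant-10"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# The card reports BF16 weights; loading in bfloat16 is an assumption, not a requirement.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Reuse the `messages` list defined above.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt")

output_ids = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    top_k=0,  # disables top-k filtering
    min_p=0.1,
    repetition_penalty=1.17,
)

# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))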

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.8e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: adamw_torch with betas=(0.9, 0.95) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 2
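
These values map onto standard transformers.TrainingArguments fields. Below is a hypothetical reconstruction for reference; the actual training script is not included in this card, and output_dir is an assumed placeholder:

from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters.
training_args = TrainingArguments(
    output_dir="minueza-2-96m-instruct-variant-10",  # assumed placeholder
    learning_rate=5.8e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=32,  # 4 per device x 32 steps = 128 total
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=2,
)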

Framework versions

  • Transformers 4.51.1
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.0

License

This model is licensed under the Apache License 2.0.
