Minueza-2-96M-Instruct (Variant 10)

This model is a fine-tuned version of Felladrin/Minueza-2-96M on the English HuggingFaceH4/ultrachat_200k dataset.

Usage

Install the pinned dependencies:

pip install transformers==4.51.1 torch==2.6.0

Then run:

from transformers import pipeline, TextStreamer
import torch

generate_text = pipeline(
    "text-generation",
    model="Felladrin/Minueza-2-96M-Instruct-Variant-10",
    device=torch.device("cuda" if torch.cuda.is_available() else "cpu"),
)

messages = [
  {
    "role": "system",
    "content": "You are a career counselor. The user will provide you with an individual looking for guidance in their professional life, and your task is to assist them in determining what careers they are most suited for based on their skills, interests, and experience. You should also conduct research into the various options available, explain the job market trends in different industries, and advice on which qualifications would be beneficial for pursuing particular fields.",
  },
  {
    "role": "user",
    "content": "Hi!",
  },
  {
    "role": "assistant",
    "content": "Hello! How can I help you?",
  },
  {
    "role": "user",
    "content": "I am interested in developing a career in software engineering. Do you have any suggestions?",
  },
]

generate_text(
    generate_text.tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    ),
    streamer=TextStreamer(generate_text.tokenizer, skip_special_tokens=True),
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    top_k=0,
    min_p=0.1,
    repetition_penalty=1.17,
)
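
If you prefer direct control over generation, the same call can be expressed with the generic AutoModelForCausalLM API. The following is a minimal sketch (not from the original card) that reuses the messages list above and mirrors the pipeline's sampling settings:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Felladrin/Minueza-2-96M-Instruct-Variant-10"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# The card reports BF16 weights; loading in bfloat16 is an assumption, not a requirement.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Reuse the `messages` list defined above.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt")

output_ids = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    top_k=0,  # disables top-k filtering
    min_p=0.1,
    repetition_penalty=1.17,
)

# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))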

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.8e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: adamw_torch with betas=(0.9, 0.95) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 2
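
These values map onto standard transformers.TrainingArguments fields. Below is a hypothetical reconstruction for reference; the actual training script is not included in this card, and output_dir is an assumed placeholder:

from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters.
training_args = TrainingArguments(
    output_dir="minueza-2-96m-instruct-variant-10",  # assumed placeholder
    learning_rate=5.8e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=32,  # 4 per device x 32 steps = 128 total
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=2,
)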

Framework versions

  • Transformers 4.51.1
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.0

License

This model is licensed under the Apache License 2.0.
