---
license: cc-by-nc-4.0
tags:
- moe
- frankenmoe
- merge
- mergekit
- lazymergekit
base_model:
- mlabonne/NeuralDaredevil-7B
- BioMistral/BioMistral-7B
- mistralai/Mathstral-7B-v0.1
- FPHam/Writing_Partner_Mistral_7B
library_name: transformers
pipeline_tag: text-generation
---
# EduMixtral-4x7B
<img src="https://cdn-uploads.huggingface.co/production/uploads/65ba68a15d2ef0a4b2c892b4/1hvgYltQRmbkzHMSXvGYh.jpeg" width=400>
EduMixtral-4x7B is an experimental model that combines several education-focused language models, intended for downstream research on human/AI student/teacher applications.
Coverage spans general knowledge, the medical field, math, and writing assistance.
## 🤏 Models Merged
EduMixtral-4x7B is a Mixture of Experts (MoE) model built from the following models using [Mergekit](https://github.com/arcee-ai/mergekit):
* [mlabonne/NeuralDaredevil-7B](https://huggingface.co/mlabonne/NeuralDaredevil-7B) <- Base Model
* [BioMistral/BioMistral-7B](https://huggingface.co/BioMistral/BioMistral-7B)
* [mistralai/Mathstral-7B-v0.1](https://huggingface.co/mistralai/Mathstral-7B-v0.1)
* [FPHam/Writing_Partner_Mistral_7B](https://huggingface.co/FPHam/Writing_Partner_Mistral_7B)
## 🧩 Configuration
```yaml
base_model: mlabonne/NeuralDaredevil-7B
gate_mode: hidden
experts:
  - source_model: mlabonne/NeuralDaredevil-7B
    positive_prompts:
      - "hello"
      - "help"
      - "question"
      - "explain"
      - "information"
  - source_model: BioMistral/BioMistral-7B
    positive_prompts:
      - "medical"
      - "health"
      - "biomedical"
      - "clinical"
      - "anatomy"
  - source_model: mistralai/Mathstral-7B-v0.1
    positive_prompts:
      - "math"
      - "calculation"
      - "equation"
      - "geometry"
      - "algebra"
  - source_model: FPHam/Writing_Partner_Mistral_7B
    positive_prompts:
      - "writing"
      - "creative process"
      - "story structure"
      - "character development"
      - "plot"
```
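Here `gate_mode: hidden` initializes each expert's router weights from hidden-state representations of its positive prompts, so medical-looking prompts route toward BioMistral, math-like prompts toward Mathstral, and so on.

Below is a minimal sketch of how a config like this is typically turned into the merged model. It assumes Mergekit is installed (`pip install mergekit`) and that the YAML above is saved as `edumixtral.yaml`; the `mergekit-moe` entry point and the output path are assumptions based on the Mergekit repository, not something shipped with this card.

```python
# Sketch: build the MoE from the config above via Mergekit's CLI entry point.
# Assumes `pip install mergekit` and that the YAML above is saved as edumixtral.yaml.
# Entry point name and paths are illustrative; check the Mergekit docs for your version.
import subprocess

subprocess.run(
    ["mergekit-moe", "edumixtral.yaml", "./EduMixtral-4x7B"],
    check=True,  # raise if the merge command fails
)
```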
## 💻 Usage
It is recommended to load the model in 8-bit or 4-bit quantization. An 8-bit example is shown below; a 4-bit variant follows the example output.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# Load the tokenizer and the model in 8-bit
tokenizer = AutoTokenizer.from_pretrained("AdamLucek/EduMixtral-4x7B")
model = AutoModelForCausalLM.from_pretrained(
    "AdamLucek/EduMixtral-4x7B",
    device_map="cuda",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True)
)

# Prepare the input text
input_text = "Math problem: Xiaoli reads a 240-page story book. She reads (1/8) of the whole book on the first day and (1/5) of the whole book on the second day. How many pages did she read in total in two days?"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

# Generate the output with specified parameters
outputs = model.generate(
    **input_ids,
    max_new_tokens=256,
    num_return_sequences=1
)

# Decode and print the generated text
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
**Output:**
>Solution:
>
>To find the total number of pages Xiaoli read in two days, we need to add the number of pages she read on the first day and the second day.
>
>On the first day, Xiaoli read 1/8 of the book. Since the book has 240 pages, the number of pages she read on the first day is:
>
>\[ \frac{1}{8} \times 240 = 30 \text{ pages} \]
>
>On the second day, Xiaoli read 1/5 of the book. The number of pages she read on the second day is:
>
>\[ \frac{1}{5} \times 240 = 48 \text{ pages} \]
>
>To find the total number of pages she read in two days, we add the pages she read on the first day and the second day:
>
>\[ 30 \text{ pages} + 48 \text{ pages} = 78 \text{ pages} \]
>
>Therefore, Xiaoli read a total of 78 pages in two days.
>
>Final answer: Xiaoli read 78 pages in total
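
The snippet above loads the model in 8-bit. When GPU memory is tighter, a 4-bit (NF4) configuration is a drop-in alternative; the sketch below uses standard `BitsAndBytesConfig` options from transformers, with settings that are illustrative rather than tuned for this model.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization: smaller memory footprint than 8-bit, at some quality cost
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained("AdamLucek/EduMixtral-4x7B")
model = AutoModelForCausalLM.from_pretrained(
    "AdamLucek/EduMixtral-4x7B",
    device_map="auto",  # let accelerate place the quantized weights
    quantization_config=bnb_config,
)
```

Generation then works exactly as in the 8-bit example, swapping `.to("cuda")` for `.to(model.device)` in case `device_map="auto"` places layers on a different device.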