prithivMLmods committed
Commit 622df32 · verified · 1 Parent(s): 2c025b4

Update README.md

Files changed (1): README.md (+78 -1)

README.md CHANGED
@@ -8,4 +8,81 @@ base_model:
   - Qwen/Qwen2.5-1.5B-Instruct
   pipeline_tag: text-generation
   ---
- Monoceros-QwenM-1.5B is a chain-of-thought model fine-tuned from Qwen-1.5B, designed for solving math problems in English and Chinese.
+
+ ![M.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/JXIomwktKoqTBjJQNy3rj.png)
+
+ # **Monoceros-QwenM-1.5B**
+
+ > **Monoceros-QwenM-1.5B** is a **chain-of-thought reasoning model** fine-tuned from **Qwen2.5-1.5B-Instruct**, specifically designed for solving **mathematical problems** in both **English** and **Chinese**. It packs step-by-step reasoning into a compact model, making it well suited to educational tools, tutoring systems, and math-focused assistants.
+
+ ## **Key Features**
+
+ 1. **Chain-of-Thought Math Reasoning**
+    Trained to produce intermediate steps for math problems, Monoceros-QwenM-1.5B keeps its reasoning transparent and interpretable, which is critical for education and answer verification.
+
+ 2. **Bilingual Proficiency (English + Chinese)**
+    Understands, reasons about, and explains math problems fluently in **both English and Simplified Chinese**, making it suitable for multilingual learning environments (a Chinese-language example follows the Quickstart below).
+
+ 3. **Compact yet Capable**
+    At only 1.5B parameters, the model delivers strong performance on arithmetic, algebra, geometry, word problems, and logical puzzles with minimal resource demands.
+
+ 4. **Step-by-Step Computation**
+    Provides structured, multi-step answers that mirror human problem solving, making them easy to follow and learn from.
+
+ ## **Quickstart with Transformers**
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "prithivMLmods/Monoceros-QwenM-1.5B"
+
+ # Load the model and tokenizer
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ # Build a chat-formatted prompt for a math word problem
+ prompt = "Solve: A train travels 180 km in 3 hours. What is its average speed?"
+ messages = [
+     {"role": "system", "content": "You are a helpful tutor skilled in solving math problems with step-by-step explanations."},
+     {"role": "user", "content": prompt}
+ ]
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+ # Generate, then strip the prompt tokens from the output
+ generated_ids = model.generate(
+     **model_inputs,
+     max_new_tokens=512
+ )
+ generated_ids = [
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+ ]
+
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ print(response)
+ ```
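+
+ The same loading and chat-template pattern covers the bilingual use case. The snippet below is a minimal sketch that reuses `model` and `tokenizer` from the Quickstart and swaps in a Chinese word problem; the prompt and system message are illustrative, not an official template.
+
+ ```python
+ # Bilingual usage sketch: the same chat template works for Chinese prompts.
+ # Reuses `model` and `tokenizer` from the Quickstart; the problem text is illustrative.
+ zh_prompt = "解题：一列火车3小时行驶180公里，它的平均速度是多少？"
+ zh_messages = [
+     {"role": "system", "content": "你是一位数学辅导老师，请逐步解答问题。"},  # "You are a math tutor; solve step by step."
+     {"role": "user", "content": zh_prompt}
+ ]
+ zh_text = tokenizer.apply_chat_template(zh_messages, tokenize=False, add_generation_prompt=True)
+ zh_inputs = tokenizer([zh_text], return_tensors="pt").to(model.device)
+ zh_ids = model.generate(**zh_inputs, max_new_tokens=512)
+ zh_ids = [out[len(inp):] for inp, out in zip(zh_inputs.input_ids, zh_ids)]
+ print(tokenizer.batch_decode(zh_ids, skip_special_tokens=True)[0])
+ ```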
+
+ ## **Intended Use**
+
+ - **Math Tutoring Bots**: Step-by-step solvers for students from basic to intermediate levels.
+ - **Bilingual Educational Apps**: Teaching math in **English** and **Chinese**, improving accessibility.
+ - **STEM Reasoning Tools**: Reasoning support for science, engineering, and logic-based problems.
+ - **Lightweight LLM Applications**: Embedded use in browsers, mobile apps, or other low-resource environments (see the quantized-loading sketch below).
+
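+ For the low-resource scenarios above, one common option is 4-bit quantized loading. This is a minimal sketch, assuming a CUDA GPU and the `bitsandbytes` package are available; it is not an official recommendation from the model card.
+
+ ```python
+ # Minimal 4-bit quantized loading sketch for low-resource environments.
+ # Assumes a CUDA GPU and the `bitsandbytes` package; not an official recommendation.
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+
+ model_name = "prithivMLmods/Monoceros-QwenM-1.5B"
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_compute_dtype=torch.bfloat16
+ )
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     quantization_config=bnb_config,
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ ```
+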
+ ## **Limitations**
+
+ 1. **Limited Domain Generalization**:
+    Optimized for math; performance may drop on creative writing, casual conversation, or other unrelated topics.
+
+ 2. **Parameter Scale**:
+    Though efficient, it may underperform larger models on highly complex or abstract math.
+
+ 3. **Bias from Base Model**:
+    Inherits any biases from Qwen2.5-1.5B-Instruct’s pretraining. Outputs should be validated in sensitive settings.
+
+ 4. **Prompt Sensitivity**:
+    Precise, structured input yields better stepwise results (see the prompting sketch below).
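+
+ Because results depend on prompt structure, a lightly structured prompt tends to outperform a bare question. The snippet below is an illustrative sketch of such a prompt, reusing the Quickstart's `messages` pattern; the exact wording is an assumption, not a tested template.
+
+ ```python
+ # Illustrative structured prompt; the wording is an assumption, not a tested template.
+ structured_prompt = (
+     "Problem: A rectangle has a perimeter of 36 cm and its length is twice its width.\n"
+     "Task: Find the length and the width.\n"
+     "Requirements: Show each step, state any formula you use, and end with 'Answer: ...'."
+ )
+ messages = [
+     {"role": "system", "content": "You are a helpful tutor skilled in solving math problems with step-by-step explanations."},
+     {"role": "user", "content": structured_prompt}
+ ]
+ ```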