	---
	license: apache-2.0
	language:
	- en
	base_model:
	- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
	pipeline_tag: text-generation
	library_name: transformers
	tags:
	- text-generation-inference
	- code
	- math
	- error-correction
	- R1
	- 14B
	- Reasoning
	---

	![3.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/XT_cpfSE_oZyA-7AZJagT.png)

	# Canum-Venaticorum-14B-B.1

	> Canum-Venaticorum-14B-B.1 is built on the Qwen 2.5 14B architecture, with deepseek-ai/DeepSeek-R1-Distill-Qwen-14B as its base model, and is tuned to improve mathematical reasoning, coding ability, and error correction at the 14B-parameter scale. This version is optimized for general-purpose reasoning, structured problem-solving, and assistant use, with a focus on understanding complex instructions, logical deduction, and multi-step computation.

	## Key Improvements
	1. Mathematical Reasoning Enhancements:
	Handles mathematical logic, symbolic computation, and step-by-step problem-solving, with attention to numerical accuracy across topics from basic arithmetic to higher-order mathematics.

	2. Coding and Debugging Proficiency:
	Improved performance in code generation, understanding documentation, and identifying and correcting bugs in multiple programming languages, especially Python, JavaScript, and C++. It supports functional, object-oriented, and scripting paradigms.

	3. Intelligent Error Correction:
	Identifies inconsistencies and errors in logical reasoning, structured formats (JSON, XML), and code outputs, and suggests or applies corrections.

	4. Enhanced Instruction Following:
	Fine-tuned for following complex, nested instructions with increased precision and coherence over extended prompts and interactions.

	5. Long-Context Support:
	Supports up to 128K tokens for input context and can generate up to 8K tokens in one output, making it well-suited for extended problem solving, document generation, and analysis.
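
The limits in item 5 can be turned into a simple pre-flight check before calling `generate`. This is a minimal sketch, not part of the model's API: the helper name `clamp_new_tokens` is hypothetical, and it assumes "128K" and "8K" mean 131072 and 8192 tokens and that the input and output limits apply separately, as the sentence above reads.

```python
# Assumption: "128K" and "8K" from the model card are read as powers of two.
MAX_INPUT_TOKENS = 128 * 1024   # 131072
MAX_OUTPUT_TOKENS = 8 * 1024    # 8192


def clamp_new_tokens(prompt_tokens: int, requested_new_tokens: int) -> int:
    """Return a max_new_tokens value that respects the stated limits.

    Raises ValueError if the prompt alone exceeds the input limit.
    """
    if prompt_tokens > MAX_INPUT_TOKENS:
        raise ValueError(
            f"prompt has {prompt_tokens} tokens; limit is {MAX_INPUT_TOKENS}"
        )
    # Never ask for more output than the stated per-response ceiling.
    return min(requested_new_tokens, MAX_OUTPUT_TOKENS)


print(clamp_new_tokens(prompt_tokens=1_000, requested_new_tokens=20_000))
```

In practice `prompt_tokens` would come from the length of the tokenized prompt, and the returned value would be passed as `max_new_tokens`.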

	## Quickstart with Transformers

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = "prithivMLmods/Canum-Venaticorum-14B-B.1"

	# Load the weights; device_map="auto" relies on the accelerate package.
	model = AutoModelForCausalLM.from_pretrained(
	    model_name,
	    torch_dtype="auto",
	    device_map="auto"
	)
	tokenizer = AutoTokenizer.from_pretrained(model_name)

	prompt = "Explain the difference between breadth-first search and depth-first search with Python code examples."
	messages = [
	    {"role": "system", "content": "You are a knowledgeable assistant skilled in reasoning, coding, and explanation."},
	    {"role": "user", "content": prompt}
	]

	# Render the conversation with the model's chat template and
	# append the marker that starts the assistant turn.
	text = tokenizer.apply_chat_template(
	    messages,
	    tokenize=False,
	    add_generation_prompt=True
	)
	model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

	generated_ids = model.generate(
	    **model_inputs,
	    max_new_tokens=512
	)

	# Drop the prompt tokens so only the newly generated tokens are decoded.
	generated_ids = [
	    output_ids[len(input_ids):]
	    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
	]

	response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
	print(response)
	```
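
Because the base model is an R1 distillation, the decoded `response` may contain a reasoning trace ahead of the final answer. The sketch below separates the two. It assumes the reasoning is delimited by `<think>` and `</think>` as in DeepSeek-R1-style outputs, which this card does not confirm for this checkpoint, and `split_reasoning` is a hypothetical helper name.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think>, as in
# DeepSeek-R1-style outputs. Adjust the tags if this model differs.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is "" if no trace is found."""
    match = _THINK_RE.search(text)
    if match is not None:
        reasoning = match.group(1).strip()
        answer = (text[:match.start()] + text[match.end():]).strip()
        return reasoning, answer
    # Some chat templates put the opening tag in the generation prompt,
    # so the decoded text may contain only the closing tag.
    head, sep, tail = text.partition("</think>")
    if sep:
        return head.strip(), tail.strip()
    return "", text.strip()


reasoning, answer = split_reasoning("<think>2 + 2 = 4</think>The answer is 4.")
print(answer)
```

With the Quickstart code, this would be called as `split_reasoning(response)`.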

	## Intended Use

	1. Mathematics and Computation:
	Effective for solving math problems, verifying formulas, and working through symbolic logic, algebraic reasoning, and analytical computations.

	2. Programming Assistance:
	Ideal for generating, explaining, and debugging code. Suitable for both learning and software development use cases.

	3. Educational and Informational Support:
	Aims to provide accurate, well-explained answers to conceptual and applied questions in STEM and the humanities.

	4. Conversational AI and Reasoning Agents:
	Designed for intelligent chatbots capable of nuanced reasoning, error correction, and structured dialogue.

	5. Multilingual & Global Applications:
	Useful for translation, multilingual support bots, and cross-lingual content generation.

	6. Long-Form & Structured Content Generation:
	Can create long documents, reports, and structured outputs like JSON, Markdown, and tabular formats.
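
When the model is asked for structured output such as JSON (item 6), it is usually worth validating the reply before using it. The following is a minimal sketch using only the standard library. `parse_model_json` is a hypothetical helper, not something shipped with the model.

```python
import json


def parse_model_json(text: str):
    """Parse a model reply as JSON.

    Returns (value, None) on success or (None, error_message) on failure.
    The error message includes the position reported by the json module.
    """
    try:
        return json.loads(text.strip()), None
    except json.JSONDecodeError as exc:
        return None, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"


value, error = parse_model_json('{"title": "Report", "sections": 3}')
print(value, error)

value, error = parse_model_json('{"title": "Report",}')
print(value, error)
```

One possible design is to send the error message back to the model in a follow-up turn and ask for a corrected reply, which exercises the error-correction behavior described above.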

	## Limitations

	1. Hardware Requirements:
	Requires high-memory GPUs or TPUs for good inference performance because of the model's size and long context window.

	2. Real-Time Limitations:
	No real-time awareness; knowledge is limited to training data.

	3. Bias and Hallucination:
	Although reduced, some bias and hallucination inherited from the training data may persist.

	4. Creative Consistency:
	Outputs can vary from run to run for creative or ambiguous queries (e.g., fiction, storytelling).

	5. Prompt Sensitivity:
	Results may vary significantly depending on the structure and clarity of the input prompt.
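
To put the hardware requirement in item 1 into numbers, the memory needed for the weights alone can be estimated as parameter count times bytes per parameter. This is a rough sketch: it takes the nominal 14B count from the model name (the exact parameter count is not stated here) and ignores the KV cache and activations, which grow with context length.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Assumption: nominal parameter count taken from the "14B" in the model name.
NOMINAL_PARAMS = 14e9

for dtype_name, nbytes in [
    ("float32", 4),
    ("bfloat16 / float16", 2),
    ("int8", 1),
    ("4-bit", 0.5),
]:
    print(f"{dtype_name}: ~{weight_memory_gb(NOMINAL_PARAMS, nbytes):.0f} GB")
```

Under these assumptions, half-precision weights need roughly 28 GB before any context is processed, so a single 24 GB GPU would need quantization or offloading.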