YOYO-O1-14B-exl2

Original model: YOYO-O1-14B by YOYO-AI
Foundation model: Qwen2.5-14B by Qwen

Quants

4bpw h6 (main)
4.5bpw h6
5bpw h6
6bpw h6
8bpw h8

Quantization notes

Made with ExLlamaV2 0.2.8 using the default calibration dataset.
These quants can be used with TabbyAPI or Text-Generation-WebUI on RTX GPUs (Windows) or RTX/ROCm GPUs (Linux).
A quant has to fit entirely into your VRAM; if you need RAM offloading, choose GGUF quants instead.
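The bpw figures above translate directly into approximate weight sizes: bits-per-weight × parameter count / 8 gives bytes. A minimal sketch of the VRAM math, assuming roughly 14.8B parameters for Qwen2.5-14B and a ~2 GiB allowance for KV cache and overhead (both figures are assumptions, not measurements):

```python
# Rough VRAM estimate for exl2 quants; parameter count and overhead are assumptions.
PARAMS = 14.8e9        # assumed parameter count for a Qwen2.5-14B model
OVERHEAD_GIB = 2.0     # assumed allowance for KV cache / activations

def weights_gib(bpw):
    """Approximate weight-only size in GiB for a given bits-per-weight."""
    return PARAMS * bpw / 8 / 2**30

def fits(vram_gib, bpw):
    """Crude check: do the weights plus overhead fit into a card's VRAM?"""
    return weights_gib(bpw) + OVERHEAD_GIB <= vram_gib

for bpw in (4.0, 4.5, 5.0, 6.0, 8.0):
    print(f"{bpw}bpw ~ {weights_gib(bpw):.1f} GiB")

print(fits(12, 4.0))  # True: the 4bpw quant should fit a 12 GiB card
print(fits(12, 8.0))  # False: 8bpw needs a bigger card
```

By this estimate the 8bpw quant wants roughly 16 GiB of VRAM before any context; treat these numbers as ballpark figures only.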

Original model card


YOYO-O1-14B

Combines the strongest 14B reasoning and code models in the open-source community into a single model.

Merge Method

This model was merged using the SCE merge method using Qwen/Qwen2.5-Coder-14B as a base.

Models Merged

The following models were included in the merge:

agentica-org/DeepCoder-14B-Preview
qihoo360/Light-R1-14B-DS
Gen-Verse/ReasonFlux-F1-14B
Qwen/Qwen2.5-Coder-14B-Instruct

Configuration

The following YAML configuration was used to produce this model:

merge_method: sce
models:
  # Pivot model
  - model: Qwen/Qwen2.5-Coder-14B
  # Target models
  - model: agentica-org/DeepCoder-14B-Preview
  - model: qihoo360/Light-R1-14B-DS
  - model: Gen-Verse/ReasonFlux-F1-14B
  - model: Qwen/Qwen2.5-Coder-14B-Instruct
base_model: Qwen/Qwen2.5-Coder-14B
parameters:
  select_topk: 1
dtype: float16
tokenizer_source: qihoo360/Light-R1-14B-DS
normalize: true
int8_mask: true
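A configuration like the one above can be executed with mergekit (an assumption: mergekit is the common tool implementing the `sce` merge method, and `sce-config.yaml` is a hypothetical filename for the YAML above), roughly as:

```shell
# Sketch, assuming mergekit is installed (pip install mergekit)
# and the YAML above is saved as sce-config.yaml.
mergekit-yaml sce-config.yaml ./YOYO-O1-14B --cuda
```

The `select_topk: 1` parameter means SCE keeps only the single highest-variance parameter source per location when fusing the target models into the base.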