πŸ“Œ Overview

A 4-bit MLX quantized version of Qwen3-30B-A6B, optimized for efficient inference with the MLX library and designed to handle long-context tasks (192k tokens) with reduced resource usage. It retains the core capabilities of Qwen3 while enabling deployment on edge devices.
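A minimal usage sketch with the `mlx-lm` package (assumed installed via `pip install mlx-lm` on an Apple Silicon Mac); the repository ID matches this model card, and the prompt is purely illustrative:

```python
# Minimal sketch: load the 4-bit MLX model and run a short generation.
from mlx_lm import load, generate

model, tokenizer = load("Goraint/Qwen3-30B-A6B-16-Extreme-128k-context-MLX-RTN-4bit")

prompt = "Summarize the benefits of 4-bit quantization."
# Apply the chat template if the tokenizer provides one.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(response)
```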

Model size: 4.77B params (safetensors)
Tensor types: BF16, U32

Model repository: Goraint/Qwen3-30B-A6B-16-Extreme-128k-context-MLX-RTN-4bit