Image-Text-to-Text

text-generation-inference

mlx-community/gemma-3-4b-it-qat-3bit

This model was converted to MLX format from google/gemma-3-4b-it-qat-q4_0-unquantized using mlx-vlm version 0.1.23. Refer to the original model card for more details on the model.

Use with mlx

pip install -U mlx-vlm

python -m mlx_vlm.generate --model mlx-community/gemma-3-4b-it-qat-3bit --max-tokens 100 --temperature 0.0 --prompt "Describe this image." --image <path_to_image>
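The same command can be driven from a script. The following is a minimal sketch that only wraps the CLI invocation shown above with `subprocess`; the helper names `build_generate_command` and `describe_image` are hypothetical and are not part of mlx-vlm. It assumes `mlx-vlm` is installed in the current Python environment.

```python
import subprocess
import sys
from pathlib import Path

MODEL_ID = "mlx-community/gemma-3-4b-it-qat-3bit"


def build_generate_command(image_path, prompt="Describe this image.",
                           max_tokens=100, temperature=0.0, model=MODEL_ID):
    """Build the argv list for the `mlx_vlm.generate` CLI call from this card.

    Hypothetical helper: it only assembles the flags shown in the card.
    """
    return [
        sys.executable, "-m", "mlx_vlm.generate",
        "--model", model,
        "--max-tokens", str(max_tokens),
        "--temperature", str(temperature),
        "--prompt", prompt,
        "--image", str(image_path),
    ]


def describe_image(image_path, **kwargs):
    """Run the CLI on one image and return whatever it prints to stdout.

    Hypothetical helper: requires mlx-vlm to be installed.
    """
    image = Path(image_path)
    if not image.is_file():
        raise FileNotFoundError(f"image not found: {image}")
    result = subprocess.run(
        build_generate_command(image, **kwargs),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits non-zero
    )
    return result.stdout
```

Calling `describe_image("photo.jpg")` (with a real image path in place of `photo.jpg`) runs the command from this card and returns its output as a string. Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes.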

Safetensors

Model size: 989M params

Tensor type: FP16 · U32

Model tree for mlx-community/gemma-3-4b-it-qat-3bit

Base model: google/gemma-3-4b-pt

Finetuned: google/gemma-3-4b-it

Finetuned (97): this model

Collection including mlx-community/gemma-3-4b-it-qat-3bit

Gemma 3 QAT

Quantization-Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory.
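To put the memory claim in context, here is a rough back-of-the-envelope sketch of weight storage at different bit widths. The 4-billion parameter count is an assumption read off the "4b" in the model name, and the functions `weight_bytes` and `compression_ratio` are hypothetical helpers. The calculation ignores per-group scales and biases and any tensors kept in higher precision, so it gives an upper bound on the savings rather than a measured figure.

```python
def weight_bytes(n_params: int, bits_per_weight: int) -> int:
    """Idealized weight storage in bytes (no scales, biases, or metadata)."""
    return n_params * bits_per_weight // 8


def compression_ratio(baseline_bits: int, quantized_bits: int) -> float:
    """Idealized size ratio between two bit widths."""
    return baseline_bits / quantized_bits


# Assumption: a nominal 4 billion parameters, taken from the "4b" in the name.
N_PARAMS = 4_000_000_000

for bits in (16, 4, 3):
    gb = weight_bytes(N_PARAMS, bits) / 1e9
    ratio = compression_ratio(16, bits)
    print(f"{bits:>2}-bit weights: ~{gb:.1f} GB ({ratio:.2f}x smaller than 16-bit)")
```

Real checkpoints are larger than these idealized numbers because quantized formats store extra scale data and usually leave some layers unquantized, which is consistent with the collection's more conservative 3x figure.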