Gemma 3 QAT
Collection
Quantization Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory.
This model was converted to MLX format from google/gemma-3-4b-it-qat-q4_0-unquantized using mlx-vlm version 0.1.23.
Refer to the original model card for more details on the model.
pip install -U mlx-vlm
python -m mlx_vlm.generate --model mlx-community/gemma-3-4b-it-qat-3bit --max-tokens 100 --temperature 0.0 --prompt "Describe this image." --image <path_to_image>
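If you would rather call the generator from a script than type the command by hand, the sketch below wraps the same CLI invocation with Python's standard `subprocess` module. It passes only the flags shown above. The helper names `build_generate_command` and `describe_image` are hypothetical, introduced here for illustration, and are not part of mlx-vlm. It assumes mlx-vlm is installed in the same Python environment that runs the script.

```python
import subprocess
import sys
from pathlib import Path

# Model ID taken from the command above.
MODEL = "mlx-community/gemma-3-4b-it-qat-3bit"


def build_generate_command(image_path, prompt="Describe this image.",
                           max_tokens=100, temperature=0.0, model=MODEL):
    """Build the argv list for the mlx_vlm.generate CLI call (hypothetical helper).

    Only the flags that appear in the command above are used.
    """
    return [
        sys.executable, "-m", "mlx_vlm.generate",  # same interpreter as this script
        "--model", model,
        "--max-tokens", str(max_tokens),
        "--temperature", str(temperature),
        "--prompt", prompt,
        "--image", str(image_path),
    ]


def describe_image(image_path):
    """Run the CLI on one local image and return whatever it prints to stdout."""
    image = Path(image_path)
    if not image.is_file():
        raise FileNotFoundError(f"Image not found: {image}")
    result = subprocess.run(
        build_generate_command(image),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits with an error
    )
    return result.stdout


# Example (requires mlx-vlm and a local image file):
# print(describe_image("photo.jpg"))
```

Building the command as a list rather than a single shell string avoids quoting problems when the prompt or the image path contains spaces.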