Quantized using [GPTQModel](https://github.com/ModelCloud/GPTQModel).
Quantization config:

    import torch
    from gptqmodel import QuantizeConfig

    quant_config = QuantizeConfig(
        bits=8,                  # 8-bit weight quantization
        group_size=32,           # per-group scales, 32 weights per group
        sym=True,                # symmetric quantization
        desc_act=False,          # keep original column order (no act-order)
        true_sequential=True,    # quantize sub-modules sequentially within each layer
        pack_dtype=torch.int32,  # dtype used to pack the quantized weights
        damp_percent=0.1,        # Hessian dampening for numerical stability
    )
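For context, a minimal sketch of how a config like this is typically passed through the GPTQModel quantization flow. The base model ID, calibration texts, and output path below are placeholders for illustration, not the ones used for this repo:

```python
import torch
from gptqmodel import GPTQModel, QuantizeConfig

quant_config = QuantizeConfig(
    bits=8, group_size=32, sym=True, desc_act=False,
    true_sequential=True, pack_dtype=torch.int32, damp_percent=0.1,
)

# Placeholder model ID and calibration data (assumptions, not from this card).
base_model_id = "org/base-model"
calibration_dataset = [
    "Example calibration sentence one.",
    "Example calibration sentence two.",
]

model = GPTQModel.load(base_model_id, quant_config)  # load FP model with quant config
model.quantize(calibration_dataset)                  # run the GPTQ calibration pass
model.save("./quantized-model")                      # write packed weights + config
```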
Chat template
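Assuming the tokenizer shipped with this repo embeds the chat template, it can be applied with the standard `transformers` API; the repo ID below is a placeholder:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("org/quantized-model")  # placeholder repo ID

messages = [
    {"role": "user", "content": "Hello!"},
]

# Render the conversation into the model's expected prompt format.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```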
Files info
Base model