TheBloke/Falcon-7B-Instruct-GPTQ

Text Generation
Transformers
Safetensors
English
RefinedWebModel
custom_code
text-generation-inference
4-bit precision
gptq
Community (19 discussions)
RuntimeError: Expected query, key, and value to have the same dtype, but got query.dtype: float key.dtype: float and value.dtype: c10::Half instead.

๐Ÿ‘ 2
#19 opened over 1 year ago by
zayuki

The model 'RWGPTQForCausalLM' is not supported for text-generation.

#18 opened over 1 year ago by herMaster

Model not working for CPU

#17 opened over 1 year ago by vivek0797

ValueError: Unrecognized configuration class

1 reply · #14 opened almost 2 years ago by hfgdfdsd

Can't use with tgi. Getting `RuntimeError: weight transformer.h.0.self_attention.query_key_value.weight does not exist`

1 reply · #12 opened almost 2 years ago by mpronesti

Integration to transformers pipeline (see the loading sketch after this list)

5 replies · #10 opened almost 2 years ago by clementdesroches

Custom 4-bit finetuning with 5-7 times faster inference than QLoRA

๐Ÿค 1
2
#5 opened almost 2 years ago by
rmihaylov

Getting 0 tokens when running with text-generation-webui

6 replies · #4 opened almost 2 years ago by avatar8875

CUDA extension not installed

3 replies · #3 opened almost 2 years ago by kllisre

Do you know anything about this error?

5 replies · #2 opened almost 2 years ago by RedXeol
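
Several of the threads above (e.g. #10, #14, and #18) concern loading this GPTQ checkpoint through the transformers pipeline. Below is a minimal sketch, assuming auto-gptq and transformers are installed and a CUDA GPU is available; it is not the repository's documented method, just one common way to load a GPTQ-quantized Falcon model.

```python
# Minimal loading sketch (assumptions: auto-gptq and transformers installed, CUDA GPU available).
from transformers import AutoTokenizer, pipeline
from auto_gptq import AutoGPTQForCausalLM

model_id = "TheBloke/Falcon-7B-Instruct-GPTQ"

# trust_remote_code=True is needed because the checkpoint uses the custom
# RefinedWebModel code shipped with the repo; leaving it out is one common cause
# of "ValueError: Unrecognized configuration class" (discussion #14).
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    use_safetensors=True,
    trust_remote_code=True,
    device="cuda:0",
    use_triton=False,
)

# The pipeline may warn that 'RWGPTQForCausalLM' is not supported for
# text-generation (discussion #18); generation still works despite the warning.
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("Write a haiku about falcons.", max_new_tokens=64)[0]["generated_text"])
```

The `device` and `use_triton` arguments are illustrative defaults; adjust them to your setup.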