
Error when deploying with vllm:0.6.4.post1

#5 opened by marinone94

Hi,

I've tried to deploy the model using vllm:0.6.4.post1 on CUDA 11.8, but I get the following error:

ValueError: As of transformers v4.44, default chat template is no longer allowed, so you must provide a chat template if the tokenizer does not define one.

Do you have a chat template you could provide? Or can I use one of these examples? https://github.com/vllm-project/vllm/tree/main/examples/
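For reference, vLLM's OpenAI-compatible server can load an external Jinja chat template via its `--chat-template` flag when the tokenizer doesn't define one. A minimal sketch, assuming the model ID is `BSC-LT/salamandra-7b` and that `./template.jinja` is a locally written template file (both are assumptions, not taken from this thread):

```shell
# Sketch: start vLLM's OpenAI-compatible server with an external chat
# template. The model ID and the template path are assumed for
# illustration; the template file must exist locally.
python -m vllm.entrypoints.openai.api_server \
    --model BSC-LT/salamandra-7b \
    --chat-template ./template.jinja
```

Note that this only silences the error; whether chat-style prompting makes sense still depends on how the model was trained.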

Best,
Emilio

Language Technologies Unit @ Barcelona Supercomputing Center org

This model is a foundation model: it is not instruction-tuned and cannot be used as a chat model. Take a look at https://huggingface.co/BSC-LT/salamandra-7b-instruct, which is probably what you want (and it does have a chat template).
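Since the instruct model's tokenizer ships its own chat template, serving it needs no extra flag. A minimal sketch (the vLLM server entrypoint is standard; no other assumptions beyond the model ID given above):

```shell
# Sketch: serve the instruction-tuned variant instead. Its tokenizer
# defines a chat template, so no --chat-template flag is required.
python -m vllm.entrypoints.openai.api_server \
    --model BSC-LT/salamandra-7b-instruct
```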

Sorry...
