Phi-4 model loads successfully on text-generation-webui, but Phi-4-mini-instruct does not

#21
by harisnaeem - opened

I'm using the latest release of text-generation-webui on my computer in CPU mode, and it successfully loads both the 4-bit and 2-bit GGUF quantizations of the Phi-4 model without any problems.

[Screenshot from 2025-03-13 16-20-04: phi-4-Q2_K loading successfully]

However, it does not seem to load any GGUF of Phi-4-mini-instruct:

[Screenshot from 2025-03-13 16-22-11: Phi-4-mini-instruct-Q6_K failing to load]

Can someone who uses text-generation-webui tell me what the problem with Phi-4-mini-instruct might be?

Phi-4-mini-instruct uses gpt-4o as its pre-tokenizer, which the llama.cpp version bundled with text-generation-webui doesn't support. There's no update for the upstream wheel repo https://github.com/jllllll/llama-cpp-python-cuBLAS-wheels.
You can solve this by updating llama.cpp (via llama-cpp-python) according to your environment.
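As a rough sketch of that fix, you could upgrade the `llama-cpp-python` package inside the environment text-generation-webui uses (the exact version that added gpt-4o pre-tokenizer support isn't stated here, so pulling the latest release is an assumption):

```shell
# Activate the same Python environment that text-generation-webui runs in first.
# Upgrade llama-cpp-python so its bundled llama.cpp recognizes the
# gpt-4o pre-tokenizer (assumes a recent release includes that support).
pip install --upgrade llama-cpp-python
```

For GPU builds you would also need wheels compiled with the matching backend (e.g. CUDA), which is why the stale cuBLAS wheel repo mentioned above is the blocker there.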
