CUDA error

#2
by Mezigore - opened

Error
Failed to generate text: FlashAttention2 has been toggled on, but it cannot be used due to the following error: Flash Attention 2 is not available on CPU. Please make sure torch can access a CUDA device.

From the page, it seems the model is being run on a CPU device. Has anyone tried running the Gradio app on a local machine with a GPU?
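For anyone hitting the same message locally, one possible workaround is to request FlashAttention 2 only when torch can actually see a CUDA device, and fall back to the default attention otherwise. The sketch below is an illustration, not the app's actual code: the helper names `pick_attn_implementation` and `detect_environment` are hypothetical, and it assumes the model is loaded through the `transformers` `from_pretrained` API, which accepts an `attn_implementation` argument.

```python
import importlib.util


def pick_attn_implementation(cuda_available: bool, flash_attn_installed: bool) -> str:
    """Pick a value for the `attn_implementation` argument of `from_pretrained`.

    Hypothetical helper: FlashAttention 2 is requested only when a CUDA device
    is visible and the flash_attn package is installed.
    """
    if cuda_available and flash_attn_installed:
        return "flash_attention_2"
    # "eager" is the plain PyTorch attention path and also works on CPU.
    # "sdpa" may be a faster alternative where the model supports it.
    return "eager"


def detect_environment() -> tuple[bool, bool]:
    """Return (CUDA visible to torch, flash_attn package importable)."""
    try:
        import torch

        cuda_available = torch.cuda.is_available()
    except ImportError:
        # No torch installed at all, so there is certainly no CUDA device.
        cuda_available = False
    flash_attn_installed = importlib.util.find_spec("flash_attn") is not None
    return cuda_available, flash_attn_installed


if __name__ == "__main__":
    cuda, flash = detect_environment()
    print(pick_attn_implementation(cuda, flash))
```

The returned string would then be passed along when loading the model, for example as `attn_implementation=pick_attn_implementation(cuda, flash)` in the `from_pretrained` call. Whether the hosted app loads its model this way is an assumption.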

Moonshot AI org

It is fixed. Try it now!

teowu changed discussion status to closed