huggingface_hub
transformers
accelerate
optimum-quanto
outlines
# --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu121
# llama-cpp-python==0.3.4