configuration_bitnet.py missing

#4
by lefromage - opened

```
OSError: microsoft/bitnet-b1.58-2B-4T does not appear to have a file named configuration_bitnet.py. Checkout 'https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/tree/main'for available files.
```

same issue

Microsoft org

To help me understand the issue better, could you please provide more information about how you're using the model? Specifically, it would be helpful to see the relevant code snippets or scripts.
Also, to ensure we're on the same page, can you confirm that you're using the version of transformers installed directly from the GitHub repository using this command?

```
pip install git+https://github.com/shumingma/transformers.git
```
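To verify that the fork is the active installation, a quick sanity check may help (a sketch; the `transformers.models.bitnet` module path is taken from the import errors reported later in this thread):

```python
# Sketch: confirm which transformers is installed and whether it ships
# built-in BitNet support; stock PyPI releases around 4.40 do not.
import importlib.util
import transformers

print(transformers.__version__)
print(importlib.util.find_spec("transformers.models.bitnet") is not None)
```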
Microsoft org

In addition to using the version of transformers above, please make sure NOT to pass trust_remote_code=True in the from_pretrained call.
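For reference, a minimal sketch of the intended loading call, mirroring the model-card example quoted later in this thread: with the fork installed, BitNet support is built in and no trust_remote_code flag is needed.

```python
# Minimal sketch: load BitNet via the fork's built-in support,
# without any trust_remote_code flag.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```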

@frontierai I'm facing the same issue. I am running the model with Docker using the command below:

```
docker run --rm \
    --privileged=true \
    --shm-size=4g \
    -p 8000:8000 \
    vllm-cpu-env \
    --model=microsoft/bitnet-b1.58-2B-4T \
    --trust-remote-code \
    --dtype=bfloat16
```

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("microsoft/bitnet-b1.58-2B-4T")
```

```
/usr/local/lib/python3.11/dist-packages/huggingface_hub/file_download.py:896: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
  warnings.warn(
/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_auth.py:94: UserWarning:
The secret HF_TOKEN does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
  warnings.warn(
The repository for microsoft/bitnet-b1.58-2B-4T contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/microsoft/bitnet-b1.58-2B-4T.
You can avoid this prompt in future by passing the argument trust_remote_code=True.

Do you wish to run the custom code? [y/N] y
Could not locate the configuration_bitnet.py inside microsoft/bitnet-b1.58-2B-4T.
```

```
HTTPError                     Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_http.py in hf_raise_for_status(response, endpoint_name)
    408 try:
--> 409     response.raise_for_status()
    410 except HTTPError as e:

16 frames
HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/resolve/main/configuration_bitnet.py

The above exception was the direct cause of the following exception:

EntryNotFoundError            Traceback (most recent call last)
EntryNotFoundError: 404 Client Error. (Request ID: Root=1-68042f0b-64d70ec2467336b62863647f;8c653f3b-556a-46e1-b14c-3d329e75cbf6)

Entry Not Found for url: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/resolve/main/configuration_bitnet.py.

The above exception was the direct cause of the following exception:

OSError                       Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/transformers/utils/hub.py in cached_file(path_or_repo_id, filename, cache_dir, force_download, resume_download, proxies, token, revision, local_files_only, subfolder, repo_type, user_agent, _raise_exceptions_for_gated_repo, _raise_exceptions_for_missing_entries, _raise_exceptions_for_connection_errors, _commit_hash, **deprecated_kwargs)
    450 if revision is None:
    451     revision = "main"
--> 452 raise EnvironmentError(
    453     f"{path_or_repo_id} does not appear to have a file named {full_filename}. Checkout "
    454     f"'https://huggingface.co/{path_or_repo_id}/{revision}' for available files."

OSError: microsoft/bitnet-b1.58-2B-4T does not appear to have a file named configuration_bitnet.py. Checkout 'https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/main' for available files.
```
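The 404 above is consistent with the repository simply not shipping configuration_bitnet.py: the custom-code prompt indicates the model's config references remote code, and a stock transformers release (4.40.0 in the log below) then tries to download that file from the Hub. A small sketch with huggingface_hub confirms which files the repo actually contains:

```python
# Sketch: list the files the repo ships to confirm that
# configuration_bitnet.py is absent, explaining the 404 above.
from huggingface_hub import list_repo_files

files = list_repo_files("microsoft/bitnet-b1.58-2B-4T")
print("configuration_bitnet.py" in files)  # expected: False
```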

```python
!pip show transformers

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model = AutoModelForCausalLM.from_pretrained("microsoft/bitnet-b1.58-2B-4T", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("microsoft/bitnet-b1.58-2B-4T")

input_text = "Daniel went back to the the the garden. Mary travelled to the kitchen. Sandra journeyed to the kitchen. Sandra went to the hallway. John went to the bedroom. Mary went back to the garden. Where is Mary?\nAnswer:"

# The change is here: remove .cuda() to keep the tensor on CPU
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output = model.generate(input_ids, max_length=50, do_sample=False)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```

```
Name: transformers
Version: 4.40.0
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: [email protected]
License: Apache 2.0 License
Location: /usr/local/lib/python3.11/dist-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by: compressed-tensors, peft, sentence-transformers, vllm, xgrammar, zetascale
/usr/local/lib/python3.11/dist-packages/huggingface_hub/file_download.py:896: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
  warnings.warn(
Could not locate the configuration_bitnet.py inside microsoft/bitnet-b1.58-2B-4T.
```

```
HTTPError                     Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_http.py in hf_raise_for_status(response, endpoint_name)
    408 try:
--> 409     response.raise_for_status()
    410 except HTTPError as e:

16 frames
HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/resolve/main/configuration_bitnet.py

The above exception was the direct cause of the following exception:

EntryNotFoundError            Traceback (most recent call last)
EntryNotFoundError: 404 Client Error. (Request ID: Root=1-68042fb5-6980da7d29f0e7976db01b6d;75efdf5b-f74c-41d3-a09c-112b138d5457)

Entry Not Found for url: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/resolve/main/configuration_bitnet.py.

The above exception was the direct cause of the following exception:

OSError                       Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/transformers/utils/hub.py in cached_file(path_or_repo_id, filename, cache_dir, force_download, resume_download, proxies, token, revision, local_files_only, subfolder, repo_type, user_agent, _raise_exceptions_for_gated_repo, _raise_exceptions_for_missing_entries, _raise_exceptions_for_connection_errors, _commit_hash, **deprecated_kwargs)
    450 if revision is None:
    451     revision = "main"
--> 452 raise EnvironmentError(
    453     f"{path_or_repo_id} does not appear to have a file named {full_filename}. Checkout "
    454     f"'https://huggingface.co/{path_or_repo_id}/{revision}' for available files."

OSError: microsoft/bitnet-b1.58-2B-4T does not appear to have a file named configuration_bitnet.py. Checkout 'https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/main' for available files.
```

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("microsoft/bitnet-b1.58-2B-4T", trust_remote_code=True)
```

```
OSError: microsoft/bitnet-b1.58-2B-4T does not appear to have a file named configuration_bitnet.py. Checkout 'https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/main' for available files
```

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
)

# Apply the chat template
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "How are you?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate response
chat_outputs = model.generate(**chat_input, max_new_tokens=11)
response = tokenizer.decode(chat_outputs[0][chat_input['input_ids'].shape[-1]:], skip_special_tokens=True)  # Decode only the response part
print("\nAssistant Response:", response)
```

```
The repository for microsoft/bitnet-b1.58-2B-4T contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/microsoft/bitnet-b1.58-2B-4T.
You can avoid this prompt in future by passing the argument trust_remote_code=True.

Do you wish to run the custom code? [y/N] y
Could not locate the configuration_bitnet.py inside microsoft/bitnet-b1.58-2B-4T.

OSError                       Traceback (most recent call last)
in <cell line: 0>()
      6 # Load tokenizer and model
      7 tokenizer = AutoTokenizer.from_pretrained(model_id)
----> 8 model = AutoModelForCausalLM.from_pretrained(
      9     model_id,
     10 )

5 frames
/usr/local/lib/python3.11/dist-packages/transformers/utils/hub.py in cached_files(path_or_repo_id, filenames, cache_dir, force_download, resume_download, proxies, token, revision, local_files_only, subfolder, repo_type, user_agent, _raise_exceptions_for_gated_repo, _raise_exceptions_for_missing_entries, _raise_exceptions_for_connection_errors, commit_hash, **deprecated_kwargs)
    515     f"a file named {missing_entries[0]}" if len(missing_entries) == 1 else f"files named {(*missing_entries,)}"
    516 )
--> 517 raise EnvironmentError(
    518     f"{path_or_repo_id} does not appear to have {msg}. Checkout 'https://huggingface.co/{path_or_repo_id}/tree/{revision}'"
    519     "for available files."

OSError: microsoft/bitnet-b1.58-2B-4T does not appear to have a file named configuration_bitnet.py. Checkout 'https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/tree/main'for available files.
```

```
Successfully installed gguf-0.16.2 numpy-1.26.4 protobuf-4.25.6 tokenizers-0.21.1 torch-2.2.2+cpu transformers-4.51.3

Successfully installed numpy-1.25.2
```

```
!pip install git+https://github.com/shumingma/transformers.git
```

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
)

# Apply the chat template
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "How are you?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate response
chat_outputs = model.generate(**chat_input, max_new_tokens=11)
response = tokenizer.decode(chat_outputs[0][chat_input['input_ids'].shape[-1]:], skip_special_tokens=True)  # Decode only the response part
print("\nAssistant Response:", response)
```

```
RuntimeError: Failed to import transformers.models.bitnet.modeling_bitnet because of the following error (look up to see its traceback):
module 'torch.library' has no attribute 'register_fake'
```
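This particular failure is unrelated to BitNet itself: torch.library.register_fake only exists in newer PyTorch (it was added around torch 2.4, while the install log above shows torch-2.2.2+cpu), so the torchvision build in the environment fails to import. A quick check, as a sketch:

```python
# Sketch: check whether the installed torch is new enough for
# torch.library.register_fake (added around torch 2.4).
import torch

print(torch.__version__)
print(hasattr(torch.library, "register_fake"))  # False on torch 2.2.x
```

Upgrading torch together with a matching torchvision build should clear this AttributeError.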

```
%cd tests/quantization/bitnet_integration
!python /content/transformers/tests/quantization/bitnet_integration/test_bitnet.py
```

```
/content/transformers/tests/quantization/bitnet_integration
Traceback (most recent call last):
  File "/content/transformers/src/transformers/utils/import_utils.py", line 1982, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/content/transformers/src/transformers/models/opt/modeling_opt.py", line 37, in <module>
    from ...modeling_utils import PreTrainedModel
  File "/content/transformers/src/transformers/modeling_utils.py", line 68, in <module>
    from .loss.loss_utils import LOSS_MAPPING
  File "/content/transformers/src/transformers/loss/loss_utils.py", line 21, in <module>
    from .loss_deformable_detr import DeformableDetrForObjectDetectionLoss, DeformableDetrForSegmentationLoss
  File "/content/transformers/src/transformers/loss/loss_deformable_detr.py", line 4, in <module>
    from ..image_transforms import center_to_corners_format
  File "/content/transformers/src/transformers/image_transforms.py", line 21, in <module>
    from .image_utils import (
  File "/content/transformers/src/transformers/image_utils.py", line 65, in <module>
    from torchvision import io as torchvision_io
  File "/usr/local/lib/python3.11/dist-packages/torchvision/__init__.py", line 10, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils  # usort:skip
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torchvision/_meta_registrations.py", line 163, in <module>
    @torch.library.register_fake("torchvision::nms")
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'torch.library' has no attribute 'register_fake'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/content/transformers/tests/quantization/bitnet_integration/test_bitnet.py", line 18, in <module>
    from transformers import (
  File "<frozen importlib._bootstrap>", line 1229, in _handle_fromlist
  File "/content/transformers/src/transformers/utils/import_utils.py", line 1970, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/content/transformers/src/transformers/utils/import_utils.py", line 1984, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.opt.modeling_opt because of the following error (look up to see its traceback):
module 'torch.library' has no attribute 'register_fake'
```

```
%cd /content/transformers/tests/quantization/bitnet_integration
!python /content/transformers/tests/quantization/bitnet_integration/test_bitnet.py
```

```
/content/transformers/tests/quantization/bitnet_integration
2025-04-20 01:14:03.906915: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
```

Requirements

```
pip install git+https://github.com/shumingma/transformers.git
```

We are actively working with the Hugging Face team to integrate the necessary code into the main transformers library. This installation method may change in the future.

Example

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16
)

# Apply the chat template
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "How are you?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate response
chat_outputs = model.generate(**chat_input, max_new_tokens=50)
response = tokenizer.decode(chat_outputs[0][chat_input['input_ids'].shape[-1]:], skip_special_tokens=True)  # Decode only the response part
print("\nAssistant Response:", response)
```

This still never runs for me.
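If the failure is the register_fake AttributeError from earlier in the thread, that is an environment problem rather than an issue with the example above. One hedged option is a guard at the top of the script (a sketch, not part of the model card):

```python
# Optional guard (sketch): fail fast with a clear message when torch is
# too old for the torchvision build installed alongside it.
import torch

if not hasattr(torch.library, "register_fake"):
    raise RuntimeError(
        f"torch {torch.__version__} lacks torch.library.register_fake; "
        "upgrade torch and install a matching torchvision."
    )
```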
