configuration_bitnet.py missing

#4
by lefromage - opened

```
OSError: microsoft/bitnet-b1.58-2B-4T does not appear to have a file named configuration_bitnet.py. Checkout 'https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/tree/main'for available files.
```

same issue

Microsoft org

To help me understand the issue better, could you please provide more information about how you're using the model? Specifically, it would be helpful to see the relevant code snippets or scripts.
Also, to ensure we're on the same page, can you confirm that you're using the version of transformers installed directly from the GitHub repository using this command?

```
pip install git+https://github.com/shumingma/transformers.git
```
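To verify that the fork is the active installation, a quick sanity check may help (a sketch; the `transformers.models.bitnet` module path is taken from the import errors reported later in this thread):

```python
# Sketch: confirm which transformers is installed and whether it ships
# built-in BitNet support; stock PyPI releases around 4.40 do not.
import importlib.util
import transformers

print(transformers.__version__)
print(importlib.util.find_spec("transformers.models.bitnet") is not None)
```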
Microsoft org

In addition to using the version of transformers above, please make sure NOT to pass trust_remote_code=True in the from_pretrained call.
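For reference, a minimal sketch of the intended loading call, mirroring the model-card example quoted later in this thread: with the fork installed, BitNet support is built in and no trust_remote_code flag is needed.

```python
# Minimal sketch: load BitNet via the fork's built-in support,
# without any trust_remote_code flag.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```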

@frontierai I'm facing the same issue. I am running the model with Docker using the command below:

```
docker run --rm \
    --privileged=true \
    --shm-size=4g \
    -p 8000:8000 \
    vllm-cpu-env \
    --model=microsoft/bitnet-b1.58-2B-4T \
    --trust-remote-code \
    --dtype=bfloat16
```

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("microsoft/bitnet-b1.58-2B-4T")
```

```
/usr/local/lib/python3.11/dist-packages/huggingface_hub/file_download.py:896: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
  warnings.warn(
/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_auth.py:94: UserWarning:
The secret HF_TOKEN does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
  warnings.warn(
The repository for microsoft/bitnet-b1.58-2B-4T contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/microsoft/bitnet-b1.58-2B-4T.
You can avoid this prompt in future by passing the argument trust_remote_code=True.

Do you wish to run the custom code? [y/N] y
Could not locate the configuration_bitnet.py inside microsoft/bitnet-b1.58-2B-4T.
```

```
HTTPError                     Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_http.py in hf_raise_for_status(response, endpoint_name)
    408 try:
--> 409     response.raise_for_status()
    410 except HTTPError as e:

16 frames
HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/resolve/main/configuration_bitnet.py

The above exception was the direct cause of the following exception:

EntryNotFoundError            Traceback (most recent call last)
EntryNotFoundError: 404 Client Error. (Request ID: Root=1-68042f0b-64d70ec2467336b62863647f;8c653f3b-556a-46e1-b14c-3d329e75cbf6)

Entry Not Found for url: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/resolve/main/configuration_bitnet.py.

The above exception was the direct cause of the following exception:

OSError                       Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/transformers/utils/hub.py in cached_file(path_or_repo_id, filename, cache_dir, force_download, resume_download, proxies, token, revision, local_files_only, subfolder, repo_type, user_agent, _raise_exceptions_for_gated_repo, _raise_exceptions_for_missing_entries, _raise_exceptions_for_connection_errors, _commit_hash, **deprecated_kwargs)
    450 if revision is None:
    451     revision = "main"
--> 452 raise EnvironmentError(
    453     f"{path_or_repo_id} does not appear to have a file named {full_filename}. Checkout "
    454     f"'https://huggingface.co/{path_or_repo_id}/{revision}' for available files."

OSError: microsoft/bitnet-b1.58-2B-4T does not appear to have a file named configuration_bitnet.py. Checkout 'https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/main' for available files.
```
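The 404 above is consistent with the repository simply not shipping configuration_bitnet.py: the custom-code prompt indicates the model's config references remote code, and a stock transformers release (4.40.0 in the log below) then tries to download that file from the Hub. A small sketch with huggingface_hub confirms which files the repo actually contains:

```python
# Sketch: list the files the repo ships to confirm that
# configuration_bitnet.py is absent, explaining the 404 above.
from huggingface_hub import list_repo_files

files = list_repo_files("microsoft/bitnet-b1.58-2B-4T")
print("configuration_bitnet.py" in files)  # expected: False
```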

```python
!pip show transformers

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model = AutoModelForCausalLM.from_pretrained("microsoft/bitnet-b1.58-2B-4T", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("microsoft/bitnet-b1.58-2B-4T")

input_text = "Daniel went back to the the the garden. Mary travelled to the kitchen. Sandra journeyed to the kitchen. Sandra went to the hallway. John went to the bedroom. Mary went back to the garden. Where is Mary?\nAnswer:"

# The change is here: remove .cuda() to keep the tensor on CPU
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output = model.generate(input_ids, max_length=50, do_sample=False)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```

```
Name: transformers
Version: 4.40.0
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: [email protected]
License: Apache 2.0 License
Location: /usr/local/lib/python3.11/dist-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by: compressed-tensors, peft, sentence-transformers, vllm, xgrammar, zetascale
/usr/local/lib/python3.11/dist-packages/huggingface_hub/file_download.py:896: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
  warnings.warn(
Could not locate the configuration_bitnet.py inside microsoft/bitnet-b1.58-2B-4T.
```

```
HTTPError                     Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_http.py in hf_raise_for_status(response, endpoint_name)
    408 try:
--> 409     response.raise_for_status()
    410 except HTTPError as e:

16 frames
HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/resolve/main/configuration_bitnet.py

The above exception was the direct cause of the following exception:

EntryNotFoundError            Traceback (most recent call last)
EntryNotFoundError: 404 Client Error. (Request ID: Root=1-68042fb5-6980da7d29f0e7976db01b6d;75efdf5b-f74c-41d3-a09c-112b138d5457)

Entry Not Found for url: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/resolve/main/configuration_bitnet.py.

The above exception was the direct cause of the following exception:

OSError                       Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/transformers/utils/hub.py in cached_file(path_or_repo_id, filename, cache_dir, force_download, resume_download, proxies, token, revision, local_files_only, subfolder, repo_type, user_agent, _raise_exceptions_for_gated_repo, _raise_exceptions_for_missing_entries, _raise_exceptions_for_connection_errors, _commit_hash, **deprecated_kwargs)
    450 if revision is None:
    451     revision = "main"
--> 452 raise EnvironmentError(
    453     f"{path_or_repo_id} does not appear to have a file named {full_filename}. Checkout "
    454     f"'https://huggingface.co/{path_or_repo_id}/{revision}' for available files."

OSError: microsoft/bitnet-b1.58-2B-4T does not appear to have a file named configuration_bitnet.py. Checkout 'https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/main' for available files.
```

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("microsoft/bitnet-b1.58-2B-4T", trust_remote_code=True)
```

```
OSError: microsoft/bitnet-b1.58-2B-4T does not appear to have a file named configuration_bitnet.py. Checkout 'https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/main' for available files
```

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
)

# Apply the chat template
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "How are you?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate response
chat_outputs = model.generate(**chat_input, max_new_tokens=11)
response = tokenizer.decode(chat_outputs[0][chat_input['input_ids'].shape[-1]:], skip_special_tokens=True)  # Decode only the response part
print("\nAssistant Response:", response)
```

```
The repository for microsoft/bitnet-b1.58-2B-4T contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/microsoft/bitnet-b1.58-2B-4T.
You can avoid this prompt in future by passing the argument trust_remote_code=True.

Do you wish to run the custom code? [y/N] y
Could not locate the configuration_bitnet.py inside microsoft/bitnet-b1.58-2B-4T.

OSError                       Traceback (most recent call last)
in <cell line: 0>()
      6 # Load tokenizer and model
      7 tokenizer = AutoTokenizer.from_pretrained(model_id)
----> 8 model = AutoModelForCausalLM.from_pretrained(
      9     model_id,
     10 )

5 frames
/usr/local/lib/python3.11/dist-packages/transformers/utils/hub.py in cached_files(path_or_repo_id, filenames, cache_dir, force_download, resume_download, proxies, token, revision, local_files_only, subfolder, repo_type, user_agent, _raise_exceptions_for_gated_repo, _raise_exceptions_for_missing_entries, _raise_exceptions_for_connection_errors, commit_hash, **deprecated_kwargs)
    515     f"a file named {missing_entries[0]}" if len(missing_entries) == 1 else f"files named {(*missing_entries,)}"
    516 )
--> 517 raise EnvironmentError(
    518     f"{path_or_repo_id} does not appear to have {msg}. Checkout 'https://huggingface.co/{path_or_repo_id}/tree/{revision}'"
    519     "for available files."

OSError: microsoft/bitnet-b1.58-2B-4T does not appear to have a file named configuration_bitnet.py. Checkout 'https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/tree/main'for available files.
```

```
Successfully installed gguf-0.16.2 numpy-1.26.4 protobuf-4.25.6 tokenizers-0.21.1 torch-2.2.2+cpu transformers-4.51.3

Successfully installed numpy-1.25.2
```

```
!pip install git+https://github.com/shumingma/transformers.git
```

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
)

# Apply the chat template
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "How are you?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate response
chat_outputs = model.generate(**chat_input, max_new_tokens=11)
response = tokenizer.decode(chat_outputs[0][chat_input['input_ids'].shape[-1]:], skip_special_tokens=True)  # Decode only the response part
print("\nAssistant Response:", response)
```

```
RuntimeError: Failed to import transformers.models.bitnet.modeling_bitnet because of the following error (look up to see its traceback):
module 'torch.library' has no attribute 'register_fake'
```
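This particular failure is unrelated to BitNet itself: torch.library.register_fake only exists in newer PyTorch (it was added around torch 2.4, while the install log above shows torch-2.2.2+cpu), so the torchvision build in the environment fails to import. A quick check, as a sketch:

```python
# Sketch: check whether the installed torch is new enough for
# torch.library.register_fake (added around torch 2.4).
import torch

print(torch.__version__)
print(hasattr(torch.library, "register_fake"))  # False on torch 2.2.x
```

Upgrading torch together with a matching torchvision build should clear this AttributeError.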

```
%cd tests/quantization/bitnet_integration
!python /content/transformers/tests/quantization/bitnet_integration/test_bitnet.py
```

```
/content/transformers/tests/quantization/bitnet_integration
Traceback (most recent call last):
  File "/content/transformers/src/transformers/utils/import_utils.py", line 1982, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/content/transformers/src/transformers/models/opt/modeling_opt.py", line 37, in <module>
    from ...modeling_utils import PreTrainedModel
  File "/content/transformers/src/transformers/modeling_utils.py", line 68, in <module>
    from .loss.loss_utils import LOSS_MAPPING
  File "/content/transformers/src/transformers/loss/loss_utils.py", line 21, in <module>
    from .loss_deformable_detr import DeformableDetrForObjectDetectionLoss, DeformableDetrForSegmentationLoss
  File "/content/transformers/src/transformers/loss/loss_deformable_detr.py", line 4, in <module>
    from ..image_transforms import center_to_corners_format
  File "/content/transformers/src/transformers/image_transforms.py", line 21, in <module>
    from .image_utils import (
  File "/content/transformers/src/transformers/image_utils.py", line 65, in <module>
    from torchvision import io as torchvision_io
  File "/usr/local/lib/python3.11/dist-packages/torchvision/__init__.py", line 10, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils  # usort:skip
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torchvision/_meta_registrations.py", line 163, in <module>
    @torch.library.register_fake("torchvision::nms")
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'torch.library' has no attribute 'register_fake'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/content/transformers/tests/quantization/bitnet_integration/test_bitnet.py", line 18, in <module>
    from transformers import (
  File "<frozen importlib._bootstrap>", line 1229, in _handle_fromlist
  File "/content/transformers/src/transformers/utils/import_utils.py", line 1970, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/content/transformers/src/transformers/utils/import_utils.py", line 1984, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.opt.modeling_opt because of the following error (look up to see its traceback):
module 'torch.library' has no attribute 'register_fake'
```

```
%cd /content/transformers/tests/quantization/bitnet_integration
!python /content/transformers/tests/quantization/bitnet_integration/test_bitnet.py
```

```
/content/transformers/tests/quantization/bitnet_integration
2025-04-20 01:14:03.906915: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
```

Requirements

```
pip install git+https://github.com/shumingma/transformers.git
```

We are actively working with the Hugging Face team to integrate the necessary code into the main transformers library. This installation method may change in the future.

Example

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16
)

# Apply the chat template
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "How are you?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate response
chat_outputs = model.generate(**chat_input, max_new_tokens=50)
response = tokenizer.decode(chat_outputs[0][chat_input['input_ids'].shape[-1]:], skip_special_tokens=True)  # Decode only the response part
print("\nAssistant Response:", response)
```

This still never runs for me.
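If the failure is the register_fake AttributeError from earlier in the thread, that is an environment problem rather than an issue with the example above. One hedged option is a guard at the top of the script (a sketch, not part of the model card):

```python
# Optional guard (sketch): fail fast with a clear message when torch is
# too old for the torchvision build installed alongside it.
import torch

if not hasattr(torch.library, "register_fake"):
    raise RuntimeError(
        f"torch {torch.__version__} lacks torch.library.register_fake; "
        "upgrade torch and install a matching torchvision."
    )
```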
