How Did You Convert Felladrin/Llama-160M-Chat-v1 to ONNX Format?
Hi! I’m interested in using the Felladrin/Llama-160M-Chat-v1 model with Transformers.js, which works best with ONNX models—ideally in INT8 for better performance. I was wondering how you converted the model to ONNX format (and if you used any specific tools or steps to quantize it to INT8). Could you share your conversion process or any scripts you used? I'd love to replicate it for local usage. Thanks in advance!
Hi! At the time I converted it, I used the conversion script from the transformers.js repository. Since then, we’ve created this space: https://huggingface.co/spaces/onnx-community/convert-to-onnx (which also uses the official conversion script), and it’s straightforward. You can also clone the space locally and run it through Docker, or download the files and run it directly with Python. Hopefully you’ll be able to convert any model to ONNX easily (as long as the architecture is supported by the ONNX library)!
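Once the conversion finishes, loading the result with Transformers.js could look like the sketch below. Treat it as a minimal example only: the repo name is a placeholder for wherever you upload the converted files, and it assumes the 2.x API of @xenova/transformers.

```js
// Minimal sketch, assuming the converted files (with the ONNX weights in an
// onnx/ subfolder) were uploaded to a hypothetical repo
// "your-username/Llama-160M-Chat-v1-onnx".
import { pipeline } from '@xenova/transformers';

const generator = await pipeline(
  'text-generation',
  'your-username/Llama-160M-Chat-v1-onnx',
  { quantized: true } // load the INT8 weights produced by quantization
);

const output = await generator('What is ONNX?', { max_new_tokens: 64 });
console.log(output); // [{ generated_text: '...' }]
```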
Thanks so much, @Felladrin . I need to understand how we can create decoder_model_merged.onnx. I'm using these models to run on a mobile device with React Native, but it seems that using just model.onnx gives me poor results. Is there something specific or important about these decoder models that I should be aware of?
I used the following script to convert TinyLlama/TinyLlama-1.1B-Chat-v1.0 to ONNX:

```bash
!python3 /content/convert.py \
  --quantize \
  --task text-generation \
  --model_id TinyLlama/TinyLlama-1.1B-Chat-v1.0
```
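For reference, the script writes the exported files under a local models/ directory. The exact layout depends on the script version and task, so treat this as a rough sketch:

```
models/TinyLlama/TinyLlama-1.1B-Chat-v1.0/
├── config.json
├── tokenizer.json
├── tokenizer_config.json
└── onnx/
    ├── model.onnx            # full-precision graph
    └── model_quantized.onnx  # INT8 weights (from --quantize)
```

Older versions of the script split text-generation exports into decoder_model.onnx, decoder_model_with_past.onnx, and decoder_model_merged.onnx instead of a single model.onnx.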
After the conversion, I tried loading the quantized model, but I encountered the following error:
```
x.split is not a function (it is undefined)
```

This error comes from the tokenizer.js file:
```js
this.bpe_ranks = new Map(config.merges.map((x, i) => [x, i]));
this.merges = config.merges.map(x => x.split(this.BPE_SPLIT_TOKEN));
```
It looks like config.merges is undefined or not in the expected format.
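For reference, the two serialization formats differ like this (illustrative values; the version boundary is the one documented in the newer tokenizer code quoted below):

```js
// Old format (tokenizers < 0.20.0): merges is a string[], each pair joined by a space.
const oldMerges = ['Ġ t', 'h e', 'i n'];
oldMerges.map(x => x.split(' ')); // works: [['Ġ', 't'], ['h', 'e'], ['i', 'n']]

// New format (tokenizers >= 0.20.0): merges is already a [string, string][].
const newMerges = [['Ġ', 't'], ['h', 'e'], ['i', 'n']];
newMerges.map(x => x.split(' ')); // TypeError: x.split is not a function
```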
I also tested my setup using onnx-community/TinyLlama-1.1B-Chat-v1.0-ONNX, and that version works fine, so it seems the issue is related to the conversion process.
My goal is to fine-tune this model on my dataset, convert it to ONNX, and run it with ONNX Runtime, but I'm blocked by this issue.
Could you help clarify what might be going wrong? Am I missing a step in the conversion process?
Thanks!
Hi @Xenova,
Sorry, I know I need to use the new @huggingface/transformers package, but I can't, since I'm running these models on mobile in a React Native app. So I'm using @xenova/transformers 2.17.2.
```js
// Tokenizers >= 0.20.0 serializes BPE merges as a [string, string][] instead of a string[],
// which resolves the ambiguity for merges containing spaces.
const use_new_merge_format = Array.isArray(config.merges[0]);

/** @type {[string, string][]} */
this.merges = use_new_merge_format
    ? /** @type {[string, string][]} */ (config.merges)
    : (/** @type {string[]} */ (config.merges)).map(
          x => /** @type {[string, string]} */ (x.split(' ', 2))
      );

this.bpe_ranks = new Map(this.merges.map((x, i) => [JSON.stringify(x), i]));
```
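(A note on the design, for anyone backporting this to 2.x: the ranks map is keyed by JSON.stringify(pair) rather than the old space-joined string, because a merge half can itself contain a space, which is exactly the ambiguity the comment above mentions.)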
I fixed the x.split error using this patch, but now I get repetitive words, and only for the models that previously threw the x.split error. I'm not sure where the issue is now, or which part of the code is responsible for the repetition. I really need to fix this.
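One thing worth checking, as a guess rather than a confirmed diagnosis: in the v3 code this patch comes from, the places that read bpe_ranks build their keys the same way the constructor now does. If the bpe() method in your 2.17.2 copy still looks ranks up with the old space-joined string, the lookup always misses:

```js
// Patched constructor: keys are JSON-stringified pairs.
this.bpe_ranks = new Map(this.merges.map((x, i) => [JSON.stringify(x), i]));

// Hypothetical leftover lookup from 2.x (left/right are placeholder names):
const rank = this.bpe_ranks.get(left + ' ' + right);          // always undefined now

// It has to mirror the new key format instead:
const fixed = this.bpe_ranks.get(JSON.stringify([left, right]));
```

If no rank is ever found, BPE stops applying merges, the model is fed near character-level tokens it was never trained on, and degenerate, repetitive output is a typical symptom.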