model_info:
  name: anemll-llama-3.2-1B-6bit-dequantized-ctx1024
  version: 0.3.0
  description: |
    Demonstrates running llama-3.2-1B-6bit-dequantized on Apple Neural Engine
    Context length: 1024
    Batch size: 64
    Chunks: 1
  license: MIT
  author: Anemll
  framework: Core ML
  language: Python
  parameters:
    context_length: 1024
    batch_size: 64
    lut_embeddings: none
    lut_ffn: 4
    lut_prefill: 6
    lut_lmhead: 6
    num_chunks: 1
    model_prefix: llama
    embeddings: llama_embeddings.mlmodelc
    lm_head: llama_lm_head_lut6.mlmodelc
    ffn: llama_FFN_lut4.mlmodelc