anemll
/

anemll-dwq-llama-3.2-1B-4b-pf6b-ctx1024_0.3.0

Apple Neural Engine

anemll-dwq-llama-3.2-1B-4b-pf6b-ctx1024_0.3.0 / meta.yaml

Upload folder using huggingface_hub

0c36737 verified 19 days ago

601 Bytes

    model_info:
      name: anemll-llama-3.2-1B-6bit-dequantized-ctx1024
      version: 0.3.0
      description: |
        Demonstrates running llama-3.2-1B-6bit-dequantized on Apple Neural Engine
        Context length: 1024
        Batch size: 64
        Chunks: 1
      license: MIT
      author: Anemll
      framework: Core ML
      language: Python
      parameters:
        context_length: 1024
        batch_size: 64
        lut_embeddings: none
        lut_ffn: 4
        lut_prefill: 6
        lut_lmhead: 6
        num_chunks: 1
        model_prefix: llama
        embeddings: llama_embeddings.mlmodelc
        lm_head: llama_lm_head_lut6.mlmodelc
        ffn: llama_FFN_lut4.mlmodelc
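
The file names in the `parameters` entries appear to follow from `model_prefix` and the `lut_*` values: `lut_ffn: 4` lines up with `llama_FFN_lut4.mlmodelc`, `lut_lmhead: 6` with `llama_lm_head_lut6.mlmodelc`, and `lut_embeddings: none` with an embeddings file that has no `_lut` suffix. The sketch below checks that consistency. It is a minimal illustration, not ANEMLL's own loader: the naming rule is inferred from this one file, the helper names (`lut_suffix`, `expected_files`, `mismatches`) are hypothetical, the values are copied in by hand rather than parsed from YAML, and only the single-chunk case (`num_chunks: 1`) is covered.

```python
from typing import Dict, Optional, Union

# Values copied by hand from the `parameters` entries of meta.yaml.
# "none" is kept as a string, as it is written in the file.
PARAMS: Dict[str, Union[int, str]] = {
    "context_length": 1024,
    "batch_size": 64,
    "lut_embeddings": "none",
    "lut_ffn": 4,
    "lut_prefill": 6,
    "lut_lmhead": 6,
    "num_chunks": 1,
    "model_prefix": "llama",
    "embeddings": "llama_embeddings.mlmodelc",
    "lm_head": "llama_lm_head_lut6.mlmodelc",
    "ffn": "llama_FFN_lut4.mlmodelc",
}


def lut_suffix(bits: Optional[Union[int, str]]) -> str:
    """Return '_lut<bits>', or '' when no LUT quantization is configured."""
    if bits is None or str(bits).lower() == "none":
        return ""
    return f"_lut{int(bits)}"


def expected_files(params: Dict[str, Union[int, str]]) -> Dict[str, str]:
    """Build file names from the prefix and LUT settings (inferred pattern)."""
    prefix = params["model_prefix"]
    return {
        "embeddings": f"{prefix}_embeddings{lut_suffix(params['lut_embeddings'])}.mlmodelc",
        "lm_head": f"{prefix}_lm_head{lut_suffix(params['lut_lmhead'])}.mlmodelc",
        "ffn": f"{prefix}_FFN{lut_suffix(params['lut_ffn'])}.mlmodelc",
    }


def mismatches(params: Dict[str, Union[int, str]]) -> Dict[str, tuple]:
    """Map each key whose listed file name differs from the inferred one
    to a (listed, inferred) pair. An empty dict means everything agrees."""
    inferred = expected_files(params)
    return {
        key: (params[key], name)
        for key, name in inferred.items()
        if params[key] != name
    }


if __name__ == "__main__":
    for key, name in expected_files(PARAMS).items():
        print(f"{key}: {name}")
    print("mismatches:", mismatches(PARAMS))
```

A check like this can catch a hand-edited `lut_*` value that no longer matches the compiled model file it is supposed to describe.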