---
license: apache-2.0
language:
- ko
- en
base_model:
- Qwen/Qwen3-4B
pipeline_tag: text-generation
---
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64633ebb39359568c63b52ad/r5EnnbDV6eGQQBeNBHu7K.png)
### Model Details
- **Name**: CarrotAI/Rabbit3-Ko-4B
- **Version**: 4B Instruct
- **Base Model**: Qwen/Qwen3-4B
- **Languages**: Korean, English
- **Model Type**: Large Language Model (Instruction-tuned)

A Korean language model based on Qwen3-4B, fine-tuned on Korean and English datasets.

- 2025.05.16: Only the normal (non-thinking) mode is supported.
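
Because only the normal mode is supported, you will usually want the chat template to skip the thinking block. Below is a minimal sketch using the `enable_thinking` switch inherited from the Qwen3 chat template (the full generation example further down shows the same call with the switch left at its default of `True`):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("CarrotAI/Rabbit3-Ko-4B")
messages = [{"role": "user", "content": "안녕하세요!"}]

# Build the prompt without a <think> block (normal / non-thinking mode).
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
```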
### Score
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|------------------|-------|----------------|-----:|-----------------------|---|-----:|---|------|
|gsm8k | 3|flexible-extract| 5|exact_match |↑ |0.8400|± |0.0101|
| | |strict-match | 5|exact_match |↑ |0.8378|± |0.0102|
|hrm8k | N/A| | | | | | | |
| - hrm8k_gsm8k | 1|none | 0|exact_match |↑ |0.8196|± |0.0106|
| - hrm8k_ksm | 1|none | 0|exact_match |↑ |0.0511|± |0.0058|
| - hrm8k_math | 1|none | 0|exact_match |↑ |0.5539|± |0.0093|
| - hrm8k_mmmlu | 1|none | 0|exact_match |↑ |0.5362|± |0.0230|
| - hrm8k_omni_math| 1|none | 0|exact_match |↑ |0.1812|± |0.0088|
|ifeval | 4|none | 0|inst_level_loose_acc |↑ |0.8753|± | N/A|
| | |none | 0|inst_level_strict_acc |↑ |0.8609|± | N/A|
| | |none | 0|prompt_level_loose_acc |↑ |0.8244|± |0.0164|
| | |none | 0|prompt_level_strict_acc|↑ |0.8078|± |0.0170|

|Groups|Version|Filter|n-shot| Metric | |Value | |Stderr|
|------|------:|------|-----:|--------|---|-----:|---|------|
|haerae| 1|none | 0|acc |↑ |0.6654|± |0.0140|
| | |none | 0|acc_norm|↑ |0.6654|± |0.0140|
|kobest| 1|none | 0|acc |↑ |0.7768|± |0.0057|
| | |none | 0|acc_norm|↑ |0.5880|± |0.0220|
| | |none | 0|f1 |↑ |0.7764|± | N/A|

| Groups                        |Version|Filter|n-shot|  Metric   |   |Value |   |Stderr|
|-------------------------------|------:|------|-----:|-----------|---|-----:|---|-----:|
|kmmlu_direct | 2|none | 0|exact_match|↑ |0.5212|± |0.0026|
| - kmmlu_direct_applied_science| 2|none | 0|exact_match|↑ |0.4997|± |0.0046|
| - kmmlu_direct_humss | 2|none | 0|exact_match|↑ |0.5365|± |0.0068|
| - kmmlu_direct_other | 2|none | 0|exact_match|↑ |0.5130|± |0.0053|
| - kmmlu_direct_stem | 2|none | 0|exact_match|↑ |0.5455|± |0.0048|
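
The tables above appear to follow the lm-evaluation-harness output format. A minimal sketch of how a comparable score could be reproduced with the harness's Python API; the exact harness version, batch size, and generation settings behind the numbers above are not stated, so treat these details as assumptions:

```python
import lm_eval
from lm_eval.models.huggingface import HFLM

# Wrap the model with the harness's Hugging Face adapter.
lm = HFLM(pretrained="CarrotAI/Rabbit3-Ko-4B", dtype="auto")

# 5-shot gsm8k, matching the n-shot column of the first table.
results = lm_eval.simple_evaluate(model=lm, tasks=["gsm8k"], num_fewshot=5)
print(results["results"]["gsm8k"])
```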
### Quickstart

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "CarrotAI/Rabbit3-Ko-4B"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # Switches between thinking and non-thinking modes. Default is True.
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# parse the thinking content
try:
    # rindex finding 151668 (</think>)
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)
print("content:", content)
```
For deployment, you can use `sglang>=0.4.6.post1` or `vllm>=0.8.5` to create an OpenAI-compatible API endpoint.
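
For example, after starting a server (e.g., `vllm serve CarrotAI/Rabbit3-Ko-4B` or `python -m sglang.launch_server --model-path CarrotAI/Rabbit3-Ko-4B`), any OpenAI-compatible client can talk to it. A minimal sketch assuming a server listening at http://localhost:8000/v1 (the default vLLM address; the SGLang port differs):

```python
from openai import OpenAI

# Assumes an OpenAI-compatible server started with vLLM or SGLang (adjust the port if needed).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="CarrotAI/Rabbit3-Ko-4B",
    messages=[{"role": "user", "content": "대규모 언어 모델에 대해 짧게 소개해 주세요."}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```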