deepseek-ai/DeepSeek-R1-Distill-Qwen-32B

#36 opened 2 months ago by

proudcats

Generation crashes from repeatedly generating <think>

#35 opened 2 months ago by

qpz

Weird... Does this 32B model support vision?

#34 opened 3 months ago by

baiall

Set base_model to deepseek-ai/DeepSeek-R1

#33 opened 3 months ago by

victor

How to build a tool-calling system prompt?

#32 opened 3 months ago by

zhaocc1106

The input starts with the token "<|begin▁of▁sentence|>" repeated twice.

#31 opened 3 months ago by

Gya123

Can you distill qwen-2.5-72b?

#30 opened 3 months ago by

xldistance

Weight file naming does not follow a regular rule

#29 opened 3 months ago by

haili-tian

bos_token_id is defined incorrectly

#28 opened 3 months ago by

haili-tian

Longer context length

#26 opened 3 months ago by

comorado

Qwen 32B Compatibility on PC w/ Ryzen 7 Pro 8840HS w/ 780M Graphics 2x32GB RAM 1TB DDR5 SSD

#25 opened 3 months ago by

arzx

When fine-tuning the distill-qwen series models with llama-factory, which model name should I select?

#24 opened 3 months ago by

wangda1

Update README.md

#23 opened 3 months ago by

Rizki-firman

Update README.md

#22 opened 3 months ago by

payam8499

Tokenizer config's `chat_template` removes everything before `</think>` XML closing tag

#21 opened 3 months ago by

jamesbraza

Consistency: can DeepSeek pass?

#20 opened 3 months ago by

zwpython

running on local machine

7

#19 opened 3 months ago by

saidavanam

Poor performance in the leaderboard?

7

#17 opened 3 months ago by

L29Ah

Add text-generation pipeline tag

#16 opened 3 months ago by

nielsr

comfyui-deepseek-r1

#15 opened 3 months ago by

zwpython

Sharing something that may be beneficial?

#13 opened 3 months ago by

9x25dillon

Please convert these models to GGUF format...

5

#12 opened 3 months ago by

Moodym

Support For Japanese Model

5

#11 opened 3 months ago by

alfredplpl

Tokenizer config is wrong

8

#10 opened 3 months ago by

stoshniwal

Garbage characters generated when using 32B

#9 opened 3 months ago by

carlosbdw

Please add a qwen2.5-72b distill

21

#8 opened 3 months ago by

warlock-edward

Does this have tooling support?

8

4

#7 opened 3 months ago by

xceptor

What temp are these expected to be used at?

#6 opened 3 months ago by

rombodawg

YaRN block required?