Parkerlambert123 committed
Commit 56e7588 · verified · 1 Parent(s): 7fb75cb

Update README.md

Files changed (1): README.md +26 -2
README.md CHANGED
@@ -117,10 +117,29 @@ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 print(response)
 ```
 
+### ZhiLight
+
+You can easily start a service using [ZhiLight](https://github.com/zhihu/ZhiLight)
+
+```bash
+docker run -it --net=host --gpus='"device=0"' -v /path/to/model:/mnt/models --entrypoint="" ghcr.io/zhihu/zhilight/zhilight:0.4.17-cu124 python -m zhilight.server.openai.entrypoints.api_server --model-path /mnt/models --port 8000 --enable-reasoning --reasoning-parser deepseek-r1 --served-model-name Zhi-writing-dsr1-14b
+
+curl http://localhost:8000/v1/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "Zhi-writing-dsr1-14b",
+    "prompt": "请你以鲁迅的口吻,写一篇介绍西湖醋鱼的文章",
+    "max_tokens": 4096,
+    "temperature": 0.6,
+    "top_p": 0.95
+  }'
+```
+
 ### vllm
+
 For instance, you can easily start a service using [vLLM](https://github.com/vllm-project/vllm)
 
-```python
+```bash
 # install vllm
 pip install vllm>=0.6.4.post1
 
@@ -145,7 +164,8 @@ curl http://localhost:8000/v1/completions \
 ### SGLang
 
 You can also easily start a service using [SGLang](https://github.com/sgl-project/sglang)
-```python
+
+```bash
 # install SGLang
 pip install "sglang[all]>=0.4.5" --find-links https://flashinfer.ai/whl/cu124/torch2.5/flashinfer-python
 
@@ -170,11 +190,15 @@ curl http://localhost:8000/v1/completions \
 ### ollama
 
 You can download ollama using [this](https://ollama.com/download/)
+
 * quantization: Q4_K_M
+
 ```bash
 ollama run zhihu/zhi-writing-dsr1-14b
 ```
+
 * bf16
+
 ```bash
 ollama run zhihu/zhi-writing-dsr1-14b:bf16
 ```
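The ZhiLight, vLLM, and SGLang commands in this diff all start OpenAI-compatible `/v1/completions` servers, which the README queries with `curl`. As a rough companion to those curl examples, here is a minimal Python sketch that builds the same request payload and separates a DeepSeek-R1-style `<think>…</think>` reasoning block from the final answer. The helper names (`build_payload`, `split_reasoning`, `complete`) are illustrative, not part of any of these projects, and the `<think>`-tag handling is an assumption based on the `--reasoning-parser deepseek-r1` flag; it assumes one of the servers above is already running on port 8000.

```python
import json
import urllib.request

# Endpoint exposed by the servers started in the README (assumed running).
API_URL = "http://localhost:8000/v1/completions"

def build_payload(prompt: str) -> dict:
    """Mirror the sampling settings used in the curl example."""
    return {
        "model": "Zhi-writing-dsr1-14b",
        "prompt": prompt,
        "max_tokens": 4096,
        "temperature": 0.6,
        "top_p": 0.95,
    }

def split_reasoning(text: str) -> tuple:
    """Split a DeepSeek-R1-style '<think>...</think>' block (if present)
    from the final answer; returns (reasoning, answer)."""
    start, end = "<think>", "</think>"
    if start in text and end in text:
        head, _, rest = text.partition(start)
        reasoning, _, answer = rest.partition(end)
        return reasoning.strip(), (head + answer).strip()
    return "", text.strip()

def complete(prompt: str) -> tuple:
    """POST the prompt and return (reasoning, answer) from the first choice."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return split_reasoning(body["choices"][0]["text"])
```

`build_payload` and `split_reasoning` run without a server; `complete` performs the actual network call, so it only succeeds once one of the backends above is up.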