zRzRzRzRzRzRzR committed · Commit fb49307 · 1 Parent(s): ca8ed56
README.md CHANGED
@@ -13,6 +13,17 @@ library_name: transformers
 
 Based on our latest technological advancements, we have trained a `GLM-4-0414` series model. During pretraining, we incorporated more code-related and reasoning-related data. In the alignment phase, we optimized the model specifically for agent capabilities. As a result, the model's performance in agent tasks such as tool use, web search, and coding has been significantly improved.
 
+| Models           | IFEval | SWE-Bench           | BFCL-v3 (Overall) | BFCL-v3 (MultiTurn) | TAU-Bench (Retail) | TAU-Bench (Airline) | SimpleQA | HotpotQA |
+|------------------|--------|---------------------|-------------------|---------------------|--------------------|---------------------|----------|----------|
+| Qwen2.5-Max      | 85.6   | 24.4                | 50.9              | 30.5                | 58.3               | 22.0                | 79.0     | 52.8     |
+| GPT-4o-1120      | 81.9   | 38.8                | 69.6              | 41.0                | 62.8               | 46.0                | 82.8     | 63.9     |
+| DeepSeek-V3-0324 | 83.4   | 38.8 (oh)           | 66.2              | 35.8                | 60.7               | 32.4                | 82.6     | 54.6     |
+| DeepSeek-R1      | 84.3   | 34 (oh) / 49.2 (al) | 57.5              | 12.4                | 33.0               | 37.3                | 83.9     | 63.1     |
+| GLM-4-32B-0414   | 86.5   |                     | 69.6              | 41.5                | 68.7               | 51.2                | 88.1     | 63.8     |
+
 ## Inference Code
 
 Make sure you are using `transformers>=4.51.3`.
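The diff pins the dependency (`transformers>=4.51.3`) but the inference code itself is not shown in this excerpt. Below is a minimal sketch using the standard `transformers` chat-template API; the model id `THUDM/GLM-4-32B-0414` is an assumption taken from the benchmark table's model name, not stated in the diff.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint id (the diff names the model GLM-4-32B-0414 but not its repo path).
MODEL_PATH = "THUDM/GLM-4-32B-0414"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype="auto",   # pick the dtype stored in the checkpoint
    device_map="auto",    # shard across available GPUs / CPU
)

messages = [{"role": "user", "content": "Write a quicksort in Python."}]

# Build the prompt with the model's own chat template.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, not the prompt.
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)
print(response)
```

This follows the generic `AutoModelForCausalLM` loading path; if the published model card specifies a different API or generation settings, those take precedence.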