Tags: Text Generation · Transformers · Safetensors · English · Japanese · llama · conversational · text-generation-inference
Taishi-N324 committed (verified) · Commit c15d6dc · 1 Parent(s): ce4b414

Update README.md

Files changed (1): README.md (+6 -6)
README.md CHANGED
@@ -50,7 +50,7 @@ The website [https://swallow-llm.github.io/](https://swallow-llm.github.io/) pro
  |Model|JCom.|JEMHopQA|NIILC|JSQuAD|XL-Sum|MGSM|WMT20-en-ja|WMT20-ja-en|JMMLU|JHumanEval|Ja Avg|
  |---|---|---|---|---|---|---|---|---|---|---|---|
  | |4-shot|4-shot|4-shot|4-shot|1-shot|4-shot|4-shot|4-shot|5-shot|0-shot| |
- | | |EM acc|Char-F1|Char-F1|Char-F1|ROUGE-2|EM acc|BLEU|BLEU|EM acc|pass@1| |
+ | |EM acc|Char-F1|Char-F1|Char-F1|ROUGE-2|EM acc|BLEU|BLEU|EM acc|pass@1| |
  | Qwen2-72B-Instruct | 0.9634 | 0.6268 | 0.5418 | 0.9210 | 0.1644 | 0.7840 | 0.2592 | 0.2327 | 0.7713 | 0.6909 | 0.5955 |
  | Qwen2.5-72B-Instruct | **0.9696** | 0.5699 | 0.5811 | 0.7381 | 0.1706 | **0.8360** | 0.2269 | 0.2179 | **0.7899** | 0.6256 | 0.5726 |
  | Llama 3 70B Instruct | 0.9419 | 0.6114 | 0.5506 | 0.9164 | 0.1912 | 0.7200 | 0.2708 | 0.2350 | 0.6789 | 0.6610 | 0.5777 |
@@ -63,10 +63,10 @@ The website [https://swallow-llm.github.io/](https://swallow-llm.github.io/) pro

  ### English tasks

- |Model|OpenBookQA|TriviaQA|HellaSWAG|SQuAD2.0|XWINO|MMLU|GSM8K|BBH|HumanEval|EnAvg|
+ |Model|OpenBookQA|TriviaQA|HellaSWAG|SQuAD2.0|XWINO|MMLU|GSM8K|BBH|HumanEval|En Avg|
  |---|---|---|---|---|---|---|---|---|---|---|
- ||4-shot|4-shot|4-shot|4-shot|4-shot|5-shot|4-shot|3-shot|0-shot||
- ||Acc|EMacc|Acc|EMacc|Acc|Acc|EMacc|CoTEMAcc|pass@1||
+ | |4-shot|4-shot|4-shot|4-shot|4-shot|5-shot|4-shot|3-shot|0-shot| |
+ | |Acc|EM acc|Acc|EM acc|Acc|Acc|EM acc|CoT EM Acc|pass@1| |
  | Qwen2-72B-Instruct | 0.4360 | 0.7588 | 0.6857 | 0.3913 | 0.9110 | 0.8391 | 0.8499 | 0.2436 | 0.6939 | 0.6455 |
  | Qwen2.5-72B-Instruct | **0.4540** | 0.6764 | **0.7064** | 0.3550 | 0.8895 | **0.8478** | **0.9113** | 0.4027 | 0.6165 | 0.6511 |
  | Llama 3 70B Instruct | 0.4400 | 0.7999 | 0.6552 | 0.4024 | 0.9127 | 0.7992 | 0.9052 | 0.8326 | 0.7555 | 0.7225 |
@@ -79,8 +79,8 @@ The website [https://swallow-llm.github.io/](https://swallow-llm.github.io/) pro

  ## MT-Bench JA

- |Model|coding|extraction|humanities|math|reasoning|roleplay|stem|writing|JMTAvg|
- |---|---|---|---|---|---|---|---|---|---|
+ | Model | coding | extraction | humanities | math | reasoning | roleplay | stem | writing | JMTAvg |
+ |-------|--------|------------|------------|------|-----------|----------|------|---------|--------|
  | Qwen2-72B-Instruct | 0.5699 | 0.7858 | 0.8222 | 0.5096 | **0.7032** | 0.7963 | 0.7728 | **0.8223** | 0.7228 |
  | Qwen2.5-72B-Instruct | 0.7060 | 0.7866 | 0.8122 | **0.6968** | 0.6536 | **0.8301** | 0.8060 | 0.7841 | 0.7594 |
  | Llama 3 70B Instruct | 0.5969 | 0.8410 | 0.7120 | 0.4481 | 0.4884 | 0.7117 | 0.6510 | 0.6900 | 0.6424 |
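
One detail worth checking when editing these tables: the Ja Avg, En Avg, and JMTAvg columns are consistent with a plain unweighted mean over each row's per-task scores. A minimal sketch, assuming that averaging convention (the row data below is copied from the Japanese-tasks table above):

```python
# Sanity check: each "Ja Avg" value should be the unweighted mean of the ten
# per-task scores in its row. Assumption: the README reports plain means
# rounded to four decimals.

ja_rows = {
    # model: (scores in column order [JCom., JEMHopQA, NIILC, JSQuAD, XL-Sum,
    #         MGSM, WMT20-en-ja, WMT20-ja-en, JMMLU, JHumanEval], reported avg)
    "Qwen2-72B-Instruct": (
        [0.9634, 0.6268, 0.5418, 0.9210, 0.1644,
         0.7840, 0.2592, 0.2327, 0.7713, 0.6909], 0.5955),
    "Qwen2.5-72B-Instruct": (
        [0.9696, 0.5699, 0.5811, 0.7381, 0.1706,
         0.8360, 0.2269, 0.2179, 0.7899, 0.6256], 0.5726),
    "Llama 3 70B Instruct": (
        [0.9419, 0.6114, 0.5506, 0.9164, 0.1912,
         0.7200, 0.2708, 0.2350, 0.6789, 0.6610], 0.5777),
}

for model, (scores, reported) in ja_rows.items():
    mean = sum(scores) / len(scores)
    print(f"{model}: computed {mean:.4f}, reported {reported:.4f}")
    # Tolerance covers the four-decimal rounding used in the table.
    assert abs(mean - reported) < 1e-4
```

The same check passes for the En Avg column (mean over nine English tasks) and the JMTAvg column (mean over eight MT-Bench JA categories) in the tables above.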