Add paper abstract to model card
This PR adds the abstract of the paper to the model card.
README.md
CHANGED
@@ -1,18 +1,18 @@
 ---
 base_model: LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct
-base_model_relation: finetune
-license: other
-license_name: exaone
-license_link: LICENSE
 language:
 - en
 - ko
+library_name: transformers
+license: other
+license_name: exaone
+license_link: LICENSE
+pipeline_tag: text-generation
 tags:
 - lg-ai
 - exaone
 - exaone-deep
-pipeline_tag: text-generation
-library_name: transformers
+base_model_relation: finetune
 ---
 
 <p align="center">
@@ -27,6 +27,8 @@ We introduce EXAONE Deep, which exhibits superior capabilities in various reason
 
 For more details, please refer to our [documentation](https://arxiv.org/abs/2503.12524), [blog](https://www.lgresearch.ai/news/view?seq=543) and [GitHub](https://github.com/LG-AI-EXAONE/EXAONE-Deep).
 
+**Abstract:** We present EXAONE Deep series, which exhibits superior capabilities in various reasoning tasks, including math and coding benchmarks. We train our models mainly on the reasoning-specialized dataset that incorporates long streams of thought processes. Evaluation results show that our smaller models, EXAONE Deep 2.4B and 7.8B, outperform other models of comparable size, while the largest model, EXAONE Deep 32B, demonstrates competitive performance against leading open-weight models. All EXAONE Deep models are openly available for research purposes and can be downloaded from https://huggingface.co/LGAI-EXAONE
+
 <p align="center">
     <img src="assets/exaone_deep_overall_performance.png", width="100%", style="margin: 40 auto;">
 
@@ -262,8 +264,11 @@ We provide the pre-quantized EXAONE Deep models with **AWQ** and several quantiz
 
 To achieve the expected performance, we recommend using the following configurations:
 
-1. Ensure the model starts with `<thought>\n` for reasoning steps. The model's output quality may be degraded when you omit it. You can easily apply this feature by using `tokenizer.apply_chat_template()` with `add_generation_prompt=True`. Please check the example code on [Quickstart](#quickstart) section.
-2. The reasoning steps of EXAONE Deep models enclosed by `<thought>\n...\n</thought>` usually have lots of tokens, so previous reasoning steps may be necessary to be removed in multi-turn situation. The provided tokenizer handles this automatically.
+1. Ensure the model starts with `<thought>
+` for reasoning steps. The model's output quality may be degraded when you omit it. You can easily apply this feature by using `tokenizer.apply_chat_template()` with `add_generation_prompt=True`. Please check the example code on [Quickstart](#quickstart) section.
+2. The reasoning steps of EXAONE Deep models enclosed by `<thought>
+...
+</thought>` usually have lots of tokens, so previous reasoning steps may be necessary to be removed in multi-turn situation. The provided tokenizer handles this automatically.
 3. Avoid using system prompt, and build the instruction on the user prompt.
 4. Additional instructions help the models reason more deeply, so that the models generate better output.
    - For math problems, the instructions **"Please reason step by step, and put your final answer within \boxed{}."** are helpful.
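For reference, recommendations 1 and 2 in the diff above map onto the card's Quickstart pattern. The sketch below is illustrative only and is not part of this PR: the checkpoint name `LGAI-EXAONE/EXAONE-Deep-2.4B`, the sample question, and the sampling settings are assumptions for demonstration. It shows `tokenizer.apply_chat_template()` with `add_generation_prompt=True` so that the completion begins with the `<thought>` block.

```python
# Illustrative sketch only -- checkpoint name and sampling settings are
# assumptions for demonstration, not values prescribed by this PR.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "LGAI-EXAONE/EXAONE-Deep-2.4B"  # assumed EXAONE Deep checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,   # EXAONE models ship custom modeling code
    device_map="auto",
)

# Recommendation 3: no system prompt; the instruction goes on the user turn.
messages = [
    {
        "role": "user",
        "content": "How many prime numbers are there below 30? "
                   "Please reason step by step, and put your final answer within \\boxed{}.",
    }
]

# add_generation_prompt=True appends the assistant prefix from the chat template,
# so the model's completion starts with the `<thought>` reasoning block
# (recommendation 1 above).
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
print(tokenizer.decode(output[0], skip_special_tokens=False))
```

For multi-turn use, appending the assistant reply to `messages` and calling `apply_chat_template()` again lets the provided tokenizer drop earlier `<thought>...</thought>` spans, as noted in recommendation 2.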