Add paper abstract to model card

#4
by nielsr HF Staff - opened
Files changed (1) hide show
  1. README.md +13 -8
README.md CHANGED
@@ -1,18 +1,18 @@
1
  ---
2
  base_model: LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct
3
- base_model_relation: finetune
4
- license: other
5
- license_name: exaone
6
- license_link: LICENSE
7
  language:
8
  - en
9
  - ko
 
 
 
 
 
10
  tags:
11
  - lg-ai
12
  - exaone
13
  - exaone-deep
14
- pipeline_tag: text-generation
15
- library_name: transformers
16
  ---
17
 
18
  <p align="center">
@@ -27,6 +27,8 @@ We introduce EXAONE Deep, which exhibits superior capabilities in various reason
27
 
28
  For more details, please refer to our [documentation](https://arxiv.org/abs/2503.12524), [blog](https://www.lgresearch.ai/news/view?seq=543) and [GitHub](https://github.com/LG-AI-EXAONE/EXAONE-Deep).
29
 
 
 
30
  <p align="center">
31
  <img src="assets/exaone_deep_overall_performance.png", width="100%", style="margin: 40 auto;">
32
 
@@ -262,8 +264,11 @@ We provide the pre-quantized EXAONE Deep models with **AWQ** and several quantiz
262
 
263
  To achieve the expected performance, we recommend using the following configurations:
264
 
265
- 1. Ensure the model starts with `<thought>\n` for reasoning steps. The model's output quality may be degraded when you omit it. You can easily apply this feature by using `tokenizer.apply_chat_template()` with `add_generation_prompt=True`. Please check the example code on [Quickstart](#quickstart) section.
266
- 2. The reasoning steps of EXAONE Deep models enclosed by `<thought>\n...\n</thought>` usually have lots of tokens, so previous reasoning steps may be necessary to be removed in multi-turn situation. The provided tokenizer handles this automatically.
 
 
 
267
  3. Avoid using system prompt, and build the instruction on the user prompt.
268
  4. Additional instructions help the models reason more deeply, so that the models generate better output.
269
  - For math problems, the instructions **"Please reason step by step, and put your final answer within \boxed{}."** are helpful.
 
1
  ---
2
  base_model: LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct
 
 
 
 
3
  language:
4
  - en
5
  - ko
6
+ library_name: transformers
7
+ license: other
8
+ license_name: exaone
9
+ license_link: LICENSE
10
+ pipeline_tag: text-generation
11
  tags:
12
  - lg-ai
13
  - exaone
14
  - exaone-deep
15
+ base_model_relation: finetune
 
16
  ---
17
 
18
  <p align="center">
 
27
 
28
  For more details, please refer to our [documentation](https://arxiv.org/abs/2503.12524), [blog](https://www.lgresearch.ai/news/view?seq=543) and [GitHub](https://github.com/LG-AI-EXAONE/EXAONE-Deep).
29
 
30
+ **Abstract:** We present EXAONE Deep series, which exhibits superior capabilities in various reasoning tasks, including math and coding benchmarks. We train our models mainly on the reasoning-specialized dataset that incorporates long streams of thought processes. Evaluation results show that our smaller models, EXAONE Deep 2.4B and 7.8B, outperform other models of comparable size, while the largest model, EXAONE Deep 32B, demonstrates competitive performance against leading open-weight models. All EXAONE Deep models are openly available for research purposes and can be downloaded from https://huggingface.co/LGAI-EXAONE
31
+
32
  <p align="center">
33
  <img src="assets/exaone_deep_overall_performance.png", width="100%", style="margin: 40 auto;">
34
 
 
264
 
265
  To achieve the expected performance, we recommend using the following configurations:
266
 
267
+ 1. Ensure the model starts with `<thought>
268
+ ` for reasoning steps. The model's output quality may be degraded when you omit it. You can easily apply this feature by using `tokenizer.apply_chat_template()` with `add_generation_prompt=True`. Please check the example code on [Quickstart](#quickstart) section.
269
+ 2. The reasoning steps of EXAONE Deep models enclosed by `<thought>
270
+ ...
271
+ </thought>` usually have lots of tokens, so previous reasoning steps may be necessary to be removed in multi-turn situation. The provided tokenizer handles this automatically.
272
  3. Avoid using system prompt, and build the instruction on the user prompt.
273
  4. Additional instructions help the models reason more deeply, so that the models generate better output.
274
  - For math problems, the instructions **"Please reason step by step, and put your final answer within \boxed{}."** are helpful.