Tags: Text Generation · Transformers · Safetensors · English · Japanese · llama · conversational · text-generation-inference
Taishi-N324 committed (verified)
Commit 22a477e · 1 Parent(s): c09e444

Update README.md

Files changed (1):
1. README.md +16 -8
README.md CHANGED
@@ -10,13 +10,16 @@ model_type: llama

# Llama3.1 Swallow

- Our Swallow model has undergone continual pre-training from the [Llama 3.1 family](https://huggingface.co/collections/meta-llama/llama-31-669fc079a0c406a149a5738f), primarily with the addition of Japanese language data. The Instruct versions use supervised fine-tuning (SFT). Links to other models can be found in the index.
+ Llama 3.1 Swallow is a series of large language models (8B, 70B) that were built by continual pre-training on the [Meta Llama 3.1](https://huggingface.co/collections/meta-llama/llama-31-669fc079a0c406a149a5738f) models.
+ Llama 3.1 Swallow enhanced the Japanese language capabilities of the original Llama 3.1 while retaining the English language capabilities.
+ We use approximately 200 billion tokens that were sampled from a large Japanese web corpus (Swallow Corpus Version 2), Japanese and English Wikipedia articles, and mathematical and
+ coding contents, etc (see the Training Datasets section) for continual pre-training.
+ The instruction-tuned models (Instruct) were built by supervised fine-tuning (SFT) on the synthetic data specially built for Japanese.
+ See the Swallow Model Index section to find other model variants.

+ # Release History

- # Model Release Updates
-
- We are excited to share the release schedule for our latest models:
- - **October 08, 2024**: Released the [Llama-3.1-Swallow-8B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-v0.1), [Llama-3.1-Swallow-8B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.1), [Llama-3.1-Swallow-70B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-v0.1), and [Llama-3.1-Swallow-70B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.1).
+ - **October 08, 2024**: Released [Llama-3.1-Swallow-8B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-v0.1), [Llama-3.1-Swallow-8B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.1), [Llama-3.1-Swallow-70B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-v0.1), and [Llama-3.1-Swallow-70B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.1).

## Swallow Model Index

@@ -27,7 +30,7 @@ We are excited to share the release schedule for our latest models:

![logo](./logo.png)

- This repository provides large language models developed by [Swallow-LLM](https://swallow-llm.github.io/).
+ The website [https://swallow-llm.github.io/](https://swallow-llm.github.io/) provides large language models developed by the Swallow team.

## Model Details

@@ -198,9 +201,14 @@ The models released here are still in the early stages of our research and devel

## Acknowledgements

- We thank Meta Research for releasing Llama 3.1 under an open license for others to build on.
+ We thank Meta Research for releasing Llama 3.1 under a generous open license.
+
+ We received various supports including:

- Our project is supported by the [Large Generative AI Development Support Program](https://abci.ai/en/link/lfm_support_program.html) of the National Institute of Advanced Industrial Science and Technology.
+ + AIST project: “Research and Development of Foundation Models for Generative AI in the Physical Domain”
+ + NEDO project: “Development of Artificial Intelligence Application Technology to Support Judgment in Design Risk Assessment Work Based on the Perspective of Skilled Persons” (JPNP18002) of “Development of Integration Technology as the Core of Next Generation Artificial Intelligence and Robotics”
+ + MEXT project: “Formation of R&D center to ensure transparency and reliability of generative AI models”
+ + AIST program: [Large Generative AI Development Support Program](https://abci.ai/en/link/lfm_support_program.html)

## License
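
The diff above adds the description of continual pre-training and instruction tuning and lists the released checkpoints, but this hunk contains no usage snippet. As a quick orientation only, the following is a minimal sketch of loading one of the released Instruct models with the standard Hugging Face transformers chat-template API. The model ID is taken from the release list above; the example prompt and the decoding parameters are illustrative assumptions, not values from the model card.

```python
# Minimal sketch (not from the model card): run one chat turn with a released
# Instruct checkpoint. Assumes transformers with Llama 3.1 support, accelerate
# (for device_map="auto"), and a GPU that fits the 8B model in bfloat16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.1"  # from the release list above

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build the prompt with the tokenizer's chat template (Japanese example prompt:
# "Please give a brief introduction to Tokyo Institute of Technology.").
messages = [
    {"role": "user", "content": "東京工業大学について簡単に紹介してください。"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Generate a response; these sampling settings are placeholders, not recommendations.
output_ids = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The base (non-Instruct) checkpoints in the release list can be used the same way, except that a plain text prompt is passed to the tokenizer instead of the chat template.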