Update README.md
README.md CHANGED
@@ -10,13 +10,16 @@ model_type: llama
 
 # Llama3.1 Swallow
 
-
+Llama 3.1 Swallow is a series of large language models (8B, 70B) built by continual pre-training on the [Meta Llama 3.1](https://huggingface.co/collections/meta-llama/llama-31-669fc079a0c406a149a5738f) models.
+Llama 3.1 Swallow enhances the Japanese language capabilities of the original Llama 3.1 while retaining its English language capabilities.
+We use approximately 200 billion tokens sampled from a large Japanese web corpus (Swallow Corpus Version 2), Japanese and English Wikipedia articles, and mathematical and
+coding content, etc. (see the Training Datasets section) for continual pre-training.
+The instruction-tuned models (Instruct) were built by supervised fine-tuning (SFT) on synthetic data specially built for Japanese.
+See the Swallow Model Index section to find other model variants.
 
+# Release History
 
-
-
-We are excited to share the release schedule for our latest models:
-- **October 08, 2024**: Released the [Llama-3.1-Swallow-8B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-v0.1), [Llama-3.1-Swallow-8B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.1), [Llama-3.1-Swallow-70B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-v0.1), and [Llama-3.1-Swallow-70B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.1).
+- **October 08, 2024**: Released [Llama-3.1-Swallow-8B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-v0.1), [Llama-3.1-Swallow-8B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.1), [Llama-3.1-Swallow-70B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-v0.1), and [Llama-3.1-Swallow-70B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.1).
 
 ## Swallow Model Index
 
@@ -27,7 +30,7 @@ We are excited to share the release schedule for our latest models:
 
 (Swallow logo image)
 
-
+The website [https://swallow-llm.github.io/](https://swallow-llm.github.io/) provides large language models developed by the Swallow team.
 
 ## Model Details
 
@@ -198,9 +201,14 @@ The models released here are still in the early stages of our research and devel
 
 ## Acknowledgements
 
-We thank Meta Research for releasing Llama 3.1 under
+We thank Meta Research for releasing Llama 3.1 under a generous open license.
+
+We received various support, including:
 
-
++ AIST project: "Research and Development of Foundation Models for Generative AI in the Physical Domain"
++ NEDO project: "Development of Artificial Intelligence Application Technology to Support Judgment in Design Risk Assessment Work Based on the Perspective of Skilled Persons" (JPNP18002) of "Development of Integration Technology as the Core of Next Generation Artificial Intelligence and Robotics"
++ MEXT project: "Formation of R&D center to ensure transparency and reliability of generative AI models"
++ AIST program: [Large Generative AI Development Support Program](https://abci.ai/en/link/lfm_support_program.html)
 
 ## License
 