Update README.md
README.md CHANGED
@@ -178,6 +178,19 @@ print(output[0].outputs[0].text)
### Instruction Tuning

The following datasets were used for the instruction tuning.

+- lmsys-chat-1m-synth-ja-wo-pii
+  - Japanese translation of the lmsys-chat-1m dataset using DeepL, with synthetic instruction data created using the Llama-3.1-405B model.
+  - 'wo-pii' indicates removal of personally identifiable information.
+- filtered-magpie-ultra-ja
+  - A subset of the magpie-ultra dataset, containing samples rated 'average', 'good', or 'excellent'.
+  - The English version (filtered-magpie-ultra-en) was translated to Japanese using the Gemma 2 27B model.
+- gemma-magpie
+  - A Japanese-only dataset.
+  - Generated using prompts for specific category-based question answering.
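
The quality filter described for filtered-magpie-ultra above can be illustrated with a short sketch. This is only a hedged illustration, not the pipeline used for this release: the Hugging Face dataset id `argilla/magpie-ultra-v0.1` and the `quality` column name are assumptions and may not match the data actually used.

```python
from datasets import load_dataset

# Ratings kept by the filtering step described in the list above
# ('average', 'good', or 'excellent').
KEEP_RATINGS = {"average", "good", "excellent"}

# Assumed dataset id and column name -- adjust to the actual source data.
ds = load_dataset("argilla/magpie-ultra-v0.1", split="train")
filtered = ds.filter(lambda example: example.get("quality") in KEEP_RATINGS)

print(f"Kept {len(filtered)} of {len(ds)} samples")
```

Per the list above, the Japanese variant (filtered-magpie-ultra-ja) would then be obtained by translating the retained English samples with the Gemma 2 27B model.
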
## Risks and Limitations