Update README.md
README.md
CHANGED
@@ -184,13 +184,12 @@ The following datasets were used for the instruction tuning.
   Japanese translation of the lmsys-chat-1m dataset using DeepL, with synthetic instruction data created using the Llama-3.1-405B model.
   'wo-pii' indicates removal of personally identifiable information.

- filtered
- Subset of magpie-ultra dataset, containing samples rated 'average'
- English version (filtered-magpie-ultra-en) translated to Japanese using Gemma 2 27B model.
+ filtered magpie-ultra
+ Subset of the [magpie-ultra](https://huggingface.co/datasets/argilla/magpie-ultra-v0.1) dataset, containing samples rated as 'average', 'good', or 'excellent'.

  gemma-magpie
- Japanese
- Generated using prompts for specific category
+ Japanese dataset.
+ Generated using prompts for specific category words.

 ## Risks and Limitations