Update README.md
Add more information on post-training
README.md
CHANGED
@@ -18,7 +18,7 @@ base_model:

# Llama-Krikri-8B-Instruct: An Instruction-tuned Large Language Model for the Greek language

Following the release of [Meltemi-7B](https://huggingface.co/ilsp/Meltemi-7B-v1) on the 26th of March 2024, we are happy to welcome Krikri to the family of ILSP open Greek LLMs.

Krikri is built on top of [Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B), extending its capabilities for Greek through continual pretraining on a large corpus of high-quality and locally relevant Greek texts. We present **Llama-Krikri-8B-Instruct**, along with the base model, [Llama-Krikri-8B-Base](https://huggingface.co/ilsp/Llama-Krikri-8B-Base).



@@ -59,7 +59,25 @@ Llama-Krikri-8B-Instruct is the result of post-training Llama-Krikri-8B-Base and

- Conversion or structured extraction (e.g., XML, JSON) in data-to-text & text-to-data settings.
- Analytical thinking and Chain-of-Thought (CoT) reasoning for problem-solving.

We used a multi-stage process to build Llama-Krikri-8B-Instruct, which includes:

- 2-stage Supervised Fine-Tuning with a combination of Greek & English instruction-response pairs
    - **Stage 1**: **856,946** instruction-response pairs (371,379 Greek + 485,567 English)
    - **Stage 2**: **638,408** instruction-response pairs (279,948 Greek + 358,460 English)
- Alignment with a combination of Greek & English preference triplets (see the sketch below)
    - **Length Normalized DPO**: **92,394** preference triplets (47,132 Greek + 45,262 English)

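As a rough, illustrative sketch of the alignment objective above (not the project's actual training code), the snippet below shows one common formulation of a length-normalized DPO loss, in which responses are compared by their average per-token log-probability rather than the summed log-probability used in standard DPO. All tensor names and the `beta` value are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def length_normalized_dpo_loss(
    policy_chosen_logps: torch.Tensor,    # (batch,) summed log-probs of chosen responses under the policy
    policy_rejected_logps: torch.Tensor,  # (batch,) summed log-probs of rejected responses under the policy
    ref_chosen_logps: torch.Tensor,       # (batch,) same quantities under the frozen reference model
    ref_rejected_logps: torch.Tensor,
    chosen_lengths: torch.Tensor,         # (batch,) number of tokens in each chosen response
    rejected_lengths: torch.Tensor,       # (batch,) number of tokens in each rejected response
    beta: float = 0.1,                    # illustrative value; the actual hyperparameters are not published here
) -> torch.Tensor:
    # Length normalization: compare *average* per-token log-probs so that
    # longer responses are not implicitly preferred.
    pi_logratio = policy_chosen_logps / chosen_lengths - policy_rejected_logps / rejected_lengths
    ref_logratio = ref_chosen_logps / chosen_lengths - ref_rejected_logps / rejected_lengths

    # Standard DPO logistic loss applied to the length-normalized log-ratios.
    return -F.logsigmoid(beta * (pi_logratio - ref_logratio)).mean()
```
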
To build the SFT & DPO data, we utilized various methodologies, including:

- Collecting existing high-quality datasets such as [Tulu 3](https://huggingface.co/datasets/allenai/tulu-3-sft-mixture), [SmolTalk](https://huggingface.co/datasets/HuggingFaceTB/smoltalk), [MAGPIE Ultra](https://huggingface.co/datasets/argilla/magpie-ultra-v1.0), [Orca Agent Instruct](https://huggingface.co/datasets/microsoft/orca-agentinstruct-1M-v1), [IFEval Like Data](https://huggingface.co/datasets/argilla/ifeval-like-data), [UltraFeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized), [NVIDIA HelpSteer2](https://huggingface.co/datasets/nvidia/HelpSteer2), [Intel Orca](https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs), [UltraMedical](https://huggingface.co/datasets/TsinghuaC3I/UltraMedical-Preference), and other datasets focused on safety, truthfulness, and instruction-following.
- Translating various data into Greek using an in-house translation tool.
- Distilling (with the MAGPIE methodology) models which exhibit strong performance in Greek, such as [Gemma 2 27B IT](https://huggingface.co/google/gemma-2-27b-it) (see the first sketch below).
- Scoring data with the [Skywork Reward Gemma 2 27B v0.2](https://huggingface.co/Skywork/Skywork-Reward-Gemma-2-27B-v0.2) Reward Model and filtering using rule-based filters (see the second sketch below).
- Creating data for sentence and document translation using high-quality parallel corpora mainly from [ELRC-SHARE](https://elrc-share.eu/).
- Synthetically extracting question-answer pairs (RAG) and multi-turn dialogues from diverse sources such as Wikipedia, EUR-LEX, Greek School Books, and Kallipos (see the third sketch below).

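For the MAGPIE-style distillation step, the core idea is to prompt an aligned model with only the "pre-query" portion of its chat template so that it completes a synthetic user instruction, and then to generate a response for that instruction. The sketch below illustrates this idea with `transformers`; the Gemma-style template string, the sampling parameters, and the single-model setup are assumptions for illustration, not a description of the actual data-generation pipeline.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-27b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# 1) Give the model only the opening of a user turn (assumed Gemma-style template);
#    an instruction-tuned model will complete it with a plausible user instruction.
pre_query = "<bos><start_of_turn>user\n"
inputs = tokenizer(pre_query, return_tensors="pt", add_special_tokens=False).to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=1.0)
instruction = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True).strip()

# 2) Answer the synthesized instruction to obtain an (instruction, response) pair for SFT.
chat = [{"role": "user", "content": instruction}]
prompt_ids = tokenizer.apply_chat_template(chat, add_generation_prompt=True, return_tensors="pt").to(model.device)
out = model.generate(prompt_ids, max_new_tokens=512, do_sample=True, temperature=0.7)
response = tokenizer.decode(out[0][prompt_ids.shape[1]:], skip_special_tokens=True).strip()
```
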
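The reward-scoring step can be pictured along the lines of the sketch below: score each (prompt, response) pair with the sequence-classification reward model and keep only examples that pass a score threshold together with simple rule-based checks. The threshold, length cap, and rules shown are hypothetical; the filters actually used are not documented in this card.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

rm_id = "Skywork/Skywork-Reward-Gemma-2-27B-v0.2"
rm_tokenizer = AutoTokenizer.from_pretrained(rm_id)
reward_model = AutoModelForSequenceClassification.from_pretrained(
    rm_id, torch_dtype=torch.bfloat16, device_map="auto", num_labels=1
)

def reward_score(prompt: str, response: str) -> float:
    # Score a single (prompt, response) pair with the reward model.
    chat = [{"role": "user", "content": prompt}, {"role": "assistant", "content": response}]
    input_ids = rm_tokenizer.apply_chat_template(chat, tokenize=True, return_tensors="pt").to(reward_model.device)
    with torch.no_grad():
        return reward_model(input_ids).logits[0][0].item()

def keep_example(prompt: str, response: str, min_score: float = 0.0, max_chars: int = 8000) -> bool:
    # Hypothetical rule-based filters (non-empty response, length cap) combined
    # with a reward threshold; the actual thresholds and rules are not public.
    if not response.strip() or len(response) > max_chars:
        return False
    return reward_score(prompt, response) >= min_score
```
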
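Similarly, the synthetic question-answer extraction can be sketched as prompting a capable instruction-following model over a source passage (e.g., a Wikipedia article) and parsing a structured reply. The prompt wording, the generic `chat` callable, and the JSON schema below are illustrative assumptions, not the actual extraction pipeline.

```python
import json
from typing import Callable

QA_PROMPT = (
    "Read the following passage and write {n} question-answer pairs in Greek that can be "
    'answered using only the passage. Reply with JSON: [{{"question": "...", "answer": "..."}}]\n\n'
    "Passage:\n{passage}"
)

def extract_qa_pairs(passage: str, chat: Callable[[str], str], n: int = 3) -> list[dict]:
    """`chat` is any prompt -> completion function, e.g. a thin wrapper around an LLM endpoint."""
    raw = chat(QA_PROMPT.format(n=n, passage=passage))
    try:
        pairs = json.loads(raw)
    except json.JSONDecodeError:
        return []  # drop malformed generations
    return [p for p in pairs if isinstance(p, dict) and p.get("question") and p.get("answer")]
```
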

# Evaluation

🚨 **More information on post-training, methodology, and evaluation coming soon.** 🚨

# How to use