JingyaoLi committed
Commit f4a4b5f · verified · 1 Parent(s): daacca0

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +16 -18
README.md CHANGED
@@ -7,28 +7,24 @@ tags:
  - full
  - generated_from_trainer
  model-index:
- - name: llama3.2_1b_instruct_pkl_1200k_e1_warmup0.1_cosinelr1e-6_seed42_maxl2048_a0.9_t1.0_logp5_freqt_0_b1.0_r1.0
+ - name: ScienceLLaMA-1B
  results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # llama3.2_1b_instruct_pkl_1200k_e1_warmup0.1_cosinelr1e-6_seed42_maxl2048_a0.9_t1.0_logp5_freqt_0_b1.0_r1.0
-
- This model is a fine-tuned version of [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct) on the OpenMathInstruct-2-1M and the metamath_gsm8k datasets.
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
+ # ScienceLLaMA-1B
+ <p align="center">
+ • 🤗 <a href="https://huggingface.co/datasets/JingyaoLi/Science-Logits-1.2M" target="_blank">Data</a>
+ • 🤗 <a href="https://huggingface.co/JingyaoLi/ScienceLLaMA-3b" target="_blank">ScienceLLaMA-3B</a>
+ • 🤗 <a href="https://huggingface.co/JingyaoLi/ScienceLLaMA-1b" target="_blank">ScienceLLaMA-1B</a>
+ • 🐱 <a href="Logits-based Finetuning" target="_blank">Code</a>
+ • 📃 Paper (to be released) <br>
+ </p>

+ This model was fine-tuned with **Logits-Based Finetuning** on [JingyaoLi/Science-Logits-1.2M](https://huggingface.co/datasets/JingyaoLi/Science-Logits-1.2M). The method integrates the strengths of supervised learning and knowledge distillation by combining teacher logits with ground-truth labels, preserving both correctness and linguistic diversity.

+ <div style="text-align: center;">
+ <img src="./images/example.png" alt="example" />
+ </div>

  ## Training procedure

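The card above names Logits-Based Finetuning but does not spell out its objective. A minimal sketch, assuming the usual convex combination of supervised cross-entropy and KL distillation against stored teacher logits; the `alpha=0.9` / `temperature=1.0` defaults are guesses read off the old run name (`a0.9_t1.0`) and may not match the released method:

```python
# Illustrative sketch only: the exact Logits-Based Finetuning objective is in
# the (to-be-released) paper and code. alpha/temperature defaults are guesses
# taken from the old run name (a0.9, t1.0).
import torch
import torch.nn.functional as F

def logits_based_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor,
                      labels: torch.Tensor, alpha: float = 0.9,
                      temperature: float = 1.0, ignore_index: int = -100) -> torch.Tensor:
    vocab = student_logits.size(-1)
    # Supervised term: next-token cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(student_logits.reshape(-1, vocab),
                         labels.reshape(-1), ignore_index=ignore_index)
    # Distillation term: KL(teacher || student) over temperature-softened logits.
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    kl = F.kl_div(log_student.reshape(-1, vocab), teacher_probs.reshape(-1, vocab),
                  reduction="batchmean") * temperature ** 2
    # Blending both terms keeps answers anchored to the ground truth while
    # preserving the teacher's distribution over plausible alternative tokens.
    return alpha * kl + (1.0 - alpha) * ce
```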
@@ -50,7 +46,9 @@ The following hyperparameters were used during training:
  - num_epochs: 1

  ### Training results
-
+ <div style="text-align: center;">
+ <img src="./images/performance.png" alt="performance" />
+ </div>


  ### Framework versions
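For reference, a minimal inference sketch using the standard `transformers` causal-LM API, with the model id taken from the card's links; it assumes the checkpoint keeps a Llama-3.2-style chat template, which this commit does not confirm:

```python
# Minimal usage sketch; model id taken from the card's links. Assumes the
# checkpoint ships a chat template (not confirmed by this commit).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "JingyaoLi/ScienceLLaMA-1b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Differentiate f(x) = x**2 * sin(x)."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Decode only the newly generated tokens, skipping the prompt.
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```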