---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---
|
# Liger-GLA-8B |
|
|
|
[\[GitHub\]](https://github.com/OpenSparseLLMs/Linearization) [\[Liger\]](https://huggingface.co/papers/2503.01496) [\[Paper\]](https://arxiv.org/abs/2503.01496)
|
|
|
We introduce **Liger-GLA-8B**, a gated linear recurrent model linearized from the Transformer-based LLM [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B).
|
|
|
<p align="center">
<img width="90%" src="figures/liger_framework.png">
</p>
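To make the gated linear recurrent structure concrete, the sketch below (a toy NumPy illustration, not the released modeling code) shows a GLA-style recurrence: a fixed-size matrix state is decayed by a data-dependent gate at each step and updated with an outer product of keys and values.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_k, d_v = 4, 3, 2  # toy sizes: sequence length, key dim, value dim

q = rng.standard_normal((T, d_k))
k = rng.standard_normal((T, d_k))
v = rng.standard_normal((T, d_v))
# per-step, per-key-dimension forget gates in (0, 1), as in GLA
alpha = 1.0 / (1.0 + np.exp(-rng.standard_normal((T, d_k))))

S = np.zeros((d_k, d_v))  # recurrent matrix state
outputs = []
for t in range(T):
    # S_t = diag(alpha_t) @ S_{t-1} + k_t v_t^T  (gated state update)
    S = alpha[t][:, None] * S + np.outer(k[t], v[t])
    # o_t = q_t^T S_t  (linear-attention readout)
    outputs.append(q[t] @ S)
O = np.stack(outputs)
print(O.shape)  # prints (4, 2)
```

Because the state `S` has a fixed size independent of sequence length, per-token generation cost is constant, in contrast to softmax attention's growing KV cache.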
|
|
|
|
|
Our **Liger** framework is compatible with various linear recurrent models with gating structures: |
|
|
|
|
|
| Model Name | Base Model | Linear Structure | HF Link |
| --- | --- | --- | --- |
| Liger-GLA-8B | [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | [GLA](https://arxiv.org/abs/2312.06635) | [🤗 link](https://huggingface.co/linear-moe-hub/Liger-GLA-8B) |
| Liger-GSA-8B | [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | [GSA](https://arxiv.org/abs/2409.07146) | [🤗 link](https://huggingface.co/linear-moe-hub/Liger-GSA-8B) |
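A minimal loading sketch is shown below. It assumes the checkpoint is served through the standard `transformers` text-generation API; the custom Liger modeling code may additionally require `trust_remote_code=True` and extra dependencies (e.g. a linear-attention kernel library), so treat this as a starting point rather than a definitive recipe.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "linear-moe-hub/Liger-GLA-8B"


def generate_text(prompt: str, max_new_tokens: int = 64) -> str:
    """Load Liger-GLA-8B and complete `prompt` (downloads the full weights)."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",
        trust_remote_code=True,  # assumption: custom modeling code ships with the repo
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Usage: `generate_text("Linear attention replaces softmax attention with")` returns the prompt plus up to 64 generated tokens.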
|
|
|
|
|
|
|
|
|
## Citation |
|
|
|
If you find this repo useful, please cite and star our work: |
|
|
|
```bibtex
@article{lan2025liger,
  title={Liger: Linearizing Large Language Models to Gated Recurrent Structures},
  author={Lan, Disen and Sun, Weigao and Hu, Jiaxi and Du, Jusen and Cheng, Yu},
  journal={arXiv preprint arXiv:2503.01496},
  year={2025}
}
```