---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---
|
# Liger-GLA-8B |
|
|
|
[\[GitHub\]](https://github.com/OpenSparseLLMs/Linearization) [\[Liger\]](https://huggingface.co/papers/2503.01496) [\[Paper\]](https://arxiv.org/abs/2503.01496)
|
|
|
We introduce **Liger-GLA-8B**, a gated linear recurrent model linearized from the Transformer-based LLM [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B).
|
|
|
<p align="center">
<img width="90%" src="figures/liger_framework.png">
</p>
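To make the gated linear recurrent structure concrete, the sketch below (a toy NumPy illustration, not the released modeling code) shows a GLA-style recurrence: a fixed-size matrix state is decayed by a data-dependent gate at each step and updated with an outer product of keys and values.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_k, d_v = 4, 3, 2  # toy sizes: sequence length, key dim, value dim

q = rng.standard_normal((T, d_k))
k = rng.standard_normal((T, d_k))
v = rng.standard_normal((T, d_v))
# per-step, per-key-dimension forget gates in (0, 1), as in GLA
alpha = 1.0 / (1.0 + np.exp(-rng.standard_normal((T, d_k))))

S = np.zeros((d_k, d_v))  # recurrent matrix state
outputs = []
for t in range(T):
    # S_t = diag(alpha_t) @ S_{t-1} + k_t v_t^T  (gated state update)
    S = alpha[t][:, None] * S + np.outer(k[t], v[t])
    # o_t = q_t^T S_t  (linear-attention readout)
    outputs.append(q[t] @ S)
O = np.stack(outputs)
print(O.shape)  # prints (4, 2)
```

Because the state `S` has a fixed size independent of sequence length, per-token generation cost is constant, in contrast to softmax attention's growing KV cache.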
|
|
|
|
|
Our **Liger** framework is compatible with various linear recurrent models with gating structures: |
|
|
|
|
|
| Model Name | Base Model | Linear Structure | HF Link |
| --- | --- | --- | --- |
| Liger-GLA-8B | [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | [GLA](https://arxiv.org/abs/2312.06635) | [🤗 link](https://huggingface.co/linear-moe-hub/Liger-GLA-8B) |
| Liger-GSA-8B | [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | [GSA](https://arxiv.org/abs/2409.07146) | [🤗 link](https://huggingface.co/linear-moe-hub/Liger-GSA-8B) |
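A minimal loading sketch is shown below. It assumes the checkpoint is served through the standard `transformers` text-generation API; the custom Liger modeling code may additionally require `trust_remote_code=True` and extra dependencies (e.g. a linear-attention kernel library), so treat this as a starting point rather than a definitive recipe.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "linear-moe-hub/Liger-GLA-8B"


def generate_text(prompt: str, max_new_tokens: int = 64) -> str:
    """Load Liger-GLA-8B and complete `prompt` (downloads the full weights)."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",
        trust_remote_code=True,  # assumption: custom modeling code ships with the repo
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Usage: `generate_text("Linear attention replaces softmax attention with")` returns the prompt plus up to 64 generated tokens.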
|
|
|
|
|
|
|
|
|
## Citation |
|
|
|
If you find this repo useful, please cite and star our work: |
|
|
|
```bibtex
@article{lan2025liger,
  title={Liger: Linearizing Large Language Models to Gated Recurrent Structures},
  author={Lan, Disen and Sun, Weigao and Hu, Jiaxi and Du, Jusen and Cheng, Yu},
  journal={arXiv preprint arXiv:2503.01496},
  year={2025}
}
```