arXiv:2505.06496

xGen-small Technical Report

Published on May 10, 2025

Abstract

We introduce xGen-small, a family of 4B and 9B Transformer decoder models optimized for long-context applications. Our vertically integrated pipeline unites domain-balanced, frequency-aware data curation; multi-stage pre-training with quality annealing and length extension to 128k tokens; and targeted post-training via supervised fine-tuning, preference learning, and online reinforcement learning. xGen-small delivers strong performance across various tasks, especially in the math and coding domains, while excelling at long-context benchmarks.
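Since the released checkpoints are hosted on the Hugging Face Hub, a minimal usage sketch with the transformers library is shown below. The repository ID `Salesforce/xgen-small-9B-instruct-r` is an assumption for illustration; check the Salesforce organization on the Hub for the actual released model names.

```python
# Minimal sketch: loading an xGen-small checkpoint with Hugging Face transformers.
# NOTE: the repo ID below is an assumption, not confirmed by the report; substitute
# the actual checkpoint name from the Salesforce org on the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Salesforce/xgen-small-9B-instruct-r"  # hypothetical ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # requires the `accelerate` package
)

# Long-context use is the model's focus; the prompt here is a stand-in.
prompt = "Summarize the following document:\n..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```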

