arxiv:2501.17811

Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling

Published on Jan 29

Abstract

In this work, we introduce Janus-Pro, an advanced version of the previous work Janus. Specifically, Janus-Pro incorporates (1) an optimized training strategy, (2) expanded training data, and (3) scaling to larger model size. With these improvements, Janus-Pro achieves significant advancements in both multimodal understanding and text-to-image instruction-following capabilities, while also enhancing the stability of text-to-image generation. We hope this work will inspire further exploration in the field. Code and models are publicly available.
