V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models Paper • 2504.06148 • Published 15 days ago • 13 • 2
Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models Paper • 2503.20198 • Published 28 days ago • 4 • 3
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation Paper • 2503.20672 • Published 28 days ago • 14 • 3
TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation Paper • 2502.07870 • Published Feb 11 • 44 • 2