DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging • Paper • 2504.12364 • Published 5 days ago • 16
HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation • Paper • 2504.13072 • Published 4 days ago • 5
ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness • Paper • 2504.10514 • Published 11 days ago • 45
Cobra: Efficient Line Art COlorization with BRoAder References • Paper • 2504.12240 • Published 5 days ago • 25
An Empirical Study of GPT-4o Image Generation Capabilities • Paper • 2504.05979 • Published 13 days ago • 59
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation • Paper • 2504.02160 • Published 19 days ago • 33
OmniSVG: A Unified Scalable Vector Graphics Generation Model • Paper • 2504.06263 • Published 13 days ago • 146
Concept Lancet: Image Editing with Compositional Representation Transplant • Paper • 2504.02828 • Published 18 days ago • 16
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning • Paper • 2504.02949 • Published 18 days ago • 19
Comprehensive Relighting: Generalizable and Consistent Monocular Human Relighting and Harmonization • Paper • 2504.03011 • Published 18 days ago • 9
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation • Paper • 2504.02542 • Published 18 days ago • 41
SkyReels-A2: Compose Anything in Video Diffusion Transformers • Paper • 2504.02436 • Published 18 days ago • 35
JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization • Paper • 2503.23377 • Published 22 days ago • 52