VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning Paper • 2505.22019 • Published 9 days ago • 10
Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model Paper • 2504.05594 • Published Apr 8 • 12
Long-Context Autoregressive Video Modeling with Next-Frame Prediction Paper • 2503.19325 • Published Mar 25 • 73
Edit Transfer: Learning Image Editing via Vision In-Context Relations Paper • 2503.13327 • Published Mar 17 • 29
ROICtrl: Boosting Instance Control for Visual Generation Paper • 2411.17949 • Published Nov 27, 2024 • 88
MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance Paper • 2312.11396 • Published Dec 18, 2023 • 11