FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding Paper • 2504.09925 • Published 10 days ago • 38
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation Paper • 2503.14905 • Published Mar 19 • 20
Long-Context Autoregressive Video Modeling with Next-Frame Prediction Paper • 2503.19325 • Published about 1 month ago • 72
LEGION: Learning to Ground and Explain for Synthetic Image Detection Paper • 2503.15264 • Published Mar 19 • 21