SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published 19 days ago • 172
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published 12 days ago • 241
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published 26 days ago • 260
A Rank Stabilization Scaling Factor for Fine-Tuning with LoRA Paper • 2312.03732 • Published Nov 28, 2023 • 9
BOP Challenge 2024 on Model-Based and Model-Free 6D Object Pose Estimation Paper • 2504.02812 • Published 23 days ago • 5
UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image Paper • 2411.16106 • Published Nov 25, 2024 • 1
LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization Paper • 2503.08619 • Published Mar 11, 2025 • 20
OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models Paper • 2503.08686 • Published Mar 11, 2025 • 19