SmolVLM: Redefining small and efficient multimodal models • Paper • 2504.05299 • Published 17 days ago • 171
AppAgentX: Evolving GUI Agents as Proficient Smartphone Users • Paper • 2503.02268 • Published Mar 4 • 11
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • Updated 16 days ago • 620k • 1.32k