Submitted by MiniMax-AI 124 MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder · 20 authors 4
Submitted by ZacharyNovack 22 Fast Text-to-Audio Generation with Adversarial Post-Training · 11 authors 2
Submitted by akhaliq 16 AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale · 8 authors 2
Submitted by akhaliq 11 Aya Vision: Advancing the Frontier of Multilingual Multimodality · 25 authors 2
Submitted by Junjie-Ye 10 A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models · 15 authors 2
Submitted by jinghan23 10 Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging · 8 authors 2
Submitted by Omartificial-Intelligence-Space 7 Advancing Arabic Reverse Dictionary Systems: A Transformer-Based Approach with Dataset Construction Guidelines · 7 authors 2
Submitted by taiwang 5 NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance · 9 authors 2
Submitted by EdBianchi 4 SkillFormer: Unified Multi-View Video Understanding for Proficiency Estimation · 2 authors 2
Submitted by Omartificial-Intelligence-Space 4 Optimizing Retrieval-Augmented Generation: Analysis of Hyperparameter Impact on Performance and Efficiency · 4 authors 2
Submitted by deleted 2 ViMRHP: A Vietnamese Benchmark Dataset for Multimodal Review Helpfulness Prediction via Human-AI Collaborative Annotation · 4 authors 2
Submitted by onekq - Tests as Prompt: A Test-Driven-Development Benchmark for LLM Code Generation · 1 author 2