Running on Zero 31 TRELLIS - Multiple Imagen a 3D 🚀 Scalable and Versatile 3D Generation from images
Running on Zero 286 Joy Caption Beta One 🖼 Generate captions for images based on various styles and formats
Vid2World: Crafting Video Diffusion Models to Interactive World Models Paper • 2505.14357 • Published 18 days ago • 25
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play Paper • 2505.02707 • Published May 5 • 82
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning Paper • 2504.14509 • Published Apr 20 • 51
PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters Paper • 2504.08791 • Published Apr 7 • 132
AerialMegaDepth: Learning Aerial-Ground Reconstruction and View Synthesis Paper • 2504.13157 • Published Apr 17 • 21
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18 • 127
Describe Anything: Detailed Localized Image and Video Captioning Paper • 2504.16072 • Published Apr 22 • 60