AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views Paper • 2505.23716 • Published 3 days ago • 25
RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination Paper • 2505.21925 • Published 5 days ago • 32
DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction Paper • 2505.21473 • Published 5 days ago • 13
Diffusion Classifiers Understand Compositionality, but Conditions Apply Paper • 2505.17955 • Published 9 days ago • 19
One RL to See Them All: Visual Triple Unified Reinforcement Learning Paper • 2505.18129 • Published 9 days ago • 56
Model Already Knows the Best Noise: Bayesian Active Noise Selection via Attention in Video Diffusion Model Paper • 2505.17561 • Published 9 days ago • 29
Vid2World: Crafting Video Diffusion Models to Interactive World Models Paper • 2505.14357 • Published 12 days ago • 25
Article Exploring Quantization Backends in Diffusers By derekl35 and 2 others • 12 days ago • 27
MMaDA: Multimodal Large Diffusion Language Models Paper • 2505.15809 • Published 11 days ago • 83
Emerging Properties in Unified Multimodal Pretraining Paper • 2505.14683 • Published 12 days ago • 124
Hybrid 3D-4D Gaussian Splatting for Fast Dynamic Scene Representation Paper • 2505.13215 • Published 13 days ago • 28
Article You could have designed state of the art positional encoding By FL33TW00D-HF • Nov 25, 2024 • 280
Article The Transformers Library: standardizing model definitions By lysandre and 3 others • 18 days ago • 107
MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder Paper • 2505.07916 • Published 20 days ago • 119
Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets Paper • 2505.07747 • Published 20 days ago • 59
Continuous Visual Autoregressive Generation via Score Maximization Paper • 2505.07812 • Published 20 days ago • 12