Orpheus Multilingual Research Release Collection Beta Release of multilingual models. • 12 items • Updated 10 days ago • 76
MoCha: Towards Movie-Grade Talking Character Synthesis Paper • 2503.23307 • Published 22 days ago • 124
WavTokenizer-Medium-Large Collection https://arxiv.org/abs/2408.16532 • 4 items • Updated Feb 25 • 11
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Paper • 2408.16532 • Published Aug 29, 2024 • 51
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM Paper • 2503.04724 • Published Mar 6 • 69
PERSE: Personalized 3D Generative Avatars from A Single Portrait Paper • 2412.21206 • Published Dec 30, 2024 • 19
Article Transformers.js v3: WebGPU support, new models & tasks, and more… Oct 22, 2024 • 73
Phi-4 Collection Phi-4 family of small language and multi-modal models. • 9 items • Updated 3 days ago • 116
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching Paper • 2410.06885 • Published Oct 9, 2024 • 47