Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. These checkpoints preserve quality comparable to half precision while using roughly 3x less memory. • 19 items • Updated 3 days ago • 18
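The "3x less memory" figure follows from storing weights in ~4 bits instead of 16. A minimal back-of-the-envelope sketch (the parameter count below is an illustrative assumption, not a Gemma 3 spec):

```python
# Rough weight-memory estimate: bf16 baseline vs. an int4 QAT checkpoint.
# The parameter count is a hypothetical example for illustration only.
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """GiB needed for raw weight storage at the given precision."""
    return n_params * bits_per_param / 8 / 2**30

n = 27e9                         # assumed parameter count (illustrative)
bf16 = weight_memory_gib(n, 16)  # half-precision baseline
int4 = weight_memory_gib(n, 4)   # 4-bit quantized weights
print(f"bf16: {bf16:.1f} GiB, int4: {int4:.1f} GiB, ratio: {bf16 / int4:.1f}x")
```

The raw ratio is 4x; in practice some layers (e.g. embeddings and norms) typically stay at higher precision, which is consistent with the ~3x end-to-end saving quoted above.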
Kimi-VL-A3B Collection Moonshot AI's efficient MoE VLMs, excelling at agentic tasks, long context, and reasoning • 6 items • Updated 9 days ago • 61
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published 14 days ago • 164
State of open code models (March 2025) Collection The best open code models on Hugging Face as of March 2025 • 7 items • Updated 27 days ago • 2
MoshiVis v0.1 Collection MoshiVis is a vision-speech model, built as a perceptually augmented version of Moshi v0.1, for conversing about image inputs • 8 items • Updated Mar 21 • 22
Training and Inference Efficiency of Encoder-Decoder Speech Models Paper • 2503.05931 • Published Mar 7 • 3
Cosmos Transfer1 Collection Multimodal Conditional World Generation for World2World Transfer • 5 items • Updated 7 days ago • 14
Article Welcome Gemma 3: Google's all-new multimodal, multilingual, long-context open LLM • Mar 12 • 392