RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale Paper • 2505.03005 • Published May 5 • 31
Implicit Language Models are RNNs: Balancing Parallelization and Expressivity Paper • 2502.07827 • Published Feb 10 • 1
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought Paper • 2504.05599 • Published Apr 8 • 83
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published Apr 8 • 165
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay Paper • 2504.03601 • Published Apr 4 • 16 • 4
RWKV-7 "Goose" with Expressive Dynamic State Evolution Paper • 2503.14456 • Published Mar 18 • 148
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning Paper • 2503.07572 • Published Mar 10 • 45
Forgetting Transformer: Softmax Attention with a Forget Gate Paper • 2503.02130 • Published Mar 3 • 32
EuroBERT: Scaling Multilingual Encoders for European Languages Paper • 2503.05500 • Published Mar 7 • 80
SurveyX: Academic Survey Automation via Large Language Models Paper • 2502.14776 • Published Feb 20 • 100
MoBA: Mixture of Block Attention for Long-Context LLMs Paper • 2502.13189 • Published Feb 18 • 17
The Ultra-Scale Playbook 🌌 Space • The ultimate guide to training LLMs on large GPU clusters • Running • 2.64k
Dria-Agent-a Collection • Powerful agentic models built for Pythonic function calling • 4 items • Updated Feb 14 • 4
Tiny-Agent-a Collection • Fast and powerful agentic models designed to run on edge devices • 6 items • Updated Feb 12 • 7
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7 • 141 • 12