HUANG SHAOHAN

buaahsh

buaahsh's activity

authored a paper 6 days ago

BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published 7 days ago • 64

liked a model 8 days ago

microsoft/bitnet-b1.58-2B-4T

Text Generation • Updated about 4 hours ago • 21.5k • 708

authored a paper 3 months ago

GeAR: Generation Augmented Retrieval

Paper • 2501.02772 • Published Jan 6 • 23

upvoted a paper 4 months ago

Multimodal Latent Language Modeling with Next-Token Diffusion

Paper • 2412.08635 • Published Dec 11, 2024 • 45

commented on a paper 5 months ago

MH-MoE: Multi-Head Mixture-of-Experts

Paper • 2411.16205 • Published Nov 25, 2024 • 28

authored a paper 5 months ago

On Domain-Specific Post-Training for Multimodal Large Language Models

Paper • 2411.19930 • Published Nov 29, 2024 • 29

upvoted a paper 5 months ago

On Domain-Specific Post-Training for Multimodal Large Language Models

Paper • 2411.19930 • Published Nov 29, 2024 • 29

liked a model 5 months ago

AdaptLLM/Adapt-MLLM-to-Domains

Updated Mar 21 • 11

authored a paper 5 months ago

MH-MoE: Multi-Head Mixture-of-Experts

Paper • 2411.16205 • Published Nov 25, 2024 • 28

upvoted a paper 5 months ago

MH-MoE: Multi-Head Mixture-of-Experts

Paper • 2411.16205 • Published Nov 25, 2024 • 28

commented on a paper 5 months ago

MH-MoE: Multi-Head Mixture-of-Experts

Paper • 2411.16205 • Published Nov 25, 2024 • 28

upvoted 2 papers 9 months ago

Multi-Head Mixture-of-Experts

Paper • 2404.15045 • Published Apr 23, 2024 • 61

E5-V: Universal Embeddings with Multimodal Large Language Models

Paper • 2407.12580 • Published Jul 17, 2024 • 41

authored a paper 10 months ago

Instruction Pre-Training: Language Models are Supervised Multitask Learners

Paper • 2406.14491 • Published Jun 20, 2024 • 94

upvoted a paper 10 months ago

Instruction Pre-Training: Language Models are Supervised Multitask Learners

Paper • 2406.14491 • Published Jun 20, 2024 • 94

upvoted a paper 11 months ago

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20, 2024 • 51

liked a Space 11 months ago

🍷 FineWeb: decanting the web for the finest text data at scale

Space • Generate high-quality web text data for LLM training • 919

authored a paper 11 months ago

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20, 2024 • 51

authored a paper 12 months ago

Multi-Head Mixture-of-Experts

Paper • 2404.15045 • Published Apr 23, 2024 • 61

upvoted a paper about 1 year ago

Retentive Network: A Successor to Transformer for Large Language Models

Paper • 2307.08621 • Published Jul 17, 2023 • 170