
Mert Erbak PRO

merterbak

AI & ML interests

NLP and Image Processing

Recent Activity

liked a model 2 days ago
deepseek-ai/DeepSeek-R1-0528
updated a Space 8 days ago
merterbak/Mistral-OCR

Organizations

Open-Source AI Meetup, MLX Community, Social Post Explorers, Hugging Face Discord Community, open/ acc, AI Starter Pack, Hugging Face MCP Course

merterbak's activity

reacted to clem's post with 🔥 15 days ago
Post
3116
Very cool to see pytorch contributing on Hugging Face. Time to follow them to see what they're cooking!
  • 2 replies
reacted to their post with 🔥 18 days ago
posted an update 18 days ago
upvoted an article 19 days ago
Article

Vision Language Models (Better, Faster, Stronger)

By merve and 4 others • 393
reacted to merve's post with 🔥 19 days ago
Post
5019
VLMS 2025 UPDATE 🔥

We just shipped a blog on everything latest on vision language models, including
🤖 GUI agents, agentic VLMs, omni models
📑 multimodal RAG
⏯️ video LMs
🤏🏻 smol models
..and more! https://huggingface.co/blog/vlms-2025
  • 1 reply
reacted to their post with 🚀🔥 20 days ago
Post
2265
Seed-Coder has been released. Developed by the ByteDance Seed team, it is designed for coding tasks and comes in base, instruct, and reasoning variants at the 8B parameter scale. Unlike traditional open-source LLMs that rely on human-crafted rules or annotated data to curate code pretraining datasets, Seed-Coder introduces a model-centric data pipeline. The pipeline processes raw data from GitHub and web archives into four categories: file-level code, repository-level code, GitHub commits, and code-related web data. A quality-filter LLM evaluates code for readability, modularity, clarity, and reusability and removes the lowest-scoring 10%, yielding a 6 trillion token dataset that supports 89 programming languages.
Models: ByteDance-Seed/seed-coder-680de32c15ead6555c75b0e4
GitHub: https://github.com/ByteDance-Seed/Seed-Coder/tree/master
Paper: https://github.com/ByteDance-Seed/Seed-Coder/blob/master/Seed-Coder.pdf
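To make the "remove the lowest 10%" step concrete, here is a rough sketch of a score-then-drop filter. It is not the Seed-Coder implementation: `drop_lowest_fraction` and `toy_score` are hypothetical names, and `toy_score` (the share of non-empty lines that are comments) is only a stand-in for the quality-filter LLM described in the post.

```python
from typing import Callable


def toy_score(source: str) -> float:
    """Hypothetical stand-in for an LLM quality score.

    Returns the fraction of non-empty lines that start with '#'.
    A real pipeline would call a scoring model here instead.
    """
    lines = [line.strip() for line in source.splitlines() if line.strip()]
    if not lines:
        return 0.0
    return sum(line.startswith("#") for line in lines) / len(lines)


def drop_lowest_fraction(
    files: list[str],
    score: Callable[[str], float],
    fraction: float = 0.10,
) -> list[str]:
    """Return `files` without the lowest-scoring `fraction`, keeping input order."""
    if not files:
        return []
    n_drop = int(len(files) * fraction)
    # Rank indices by ascending score; the first n_drop are the ones to discard.
    ranked = sorted(range(len(files)), key=lambda i: score(files[i]))
    dropped = set(ranked[:n_drop])
    return [f for i, f in enumerate(files) if i not in dropped]


if __name__ == "__main__":
    # Nine lightly commented files plus one with no comments at all.
    corpus = [f"# doc {i}\nx = {i}\n" for i in range(9)] + ["x=1\ny=2\n"]
    kept = drop_lowest_fraction(corpus, toy_score)
    print(f"kept {len(kept)} of {len(corpus)} files")
```

Swapping `toy_score` for a call to an actual scoring model would leave the filtering logic unchanged, since the filter only needs a function from source text to a number.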
posted an update 20 days ago