219 813

jiakai

real-jiakai

https://blog.gujiakai.top

AI & ML interests

LLM && Smart QA

Recent Activity

liked a model about 4 hours ago

microsoft/mineworld

real-jiakai's activity

upvoted a collection about 3 hours ago

Hallucination detection

Trained ModernBERT (base and large) for detecting hallucinations in LLM responses. The models are trained as token classifiers. • 4 items • Updated Mar 5 • 16

upvoted a collection 3 days ago

MAI-DS-R1

MAI-DS-R1 is a DeepSeek-R1 reasoning model that has been post-trained by the Microsoft AI team. • 2 items • Updated 3 days ago • 7

upvoted 2 articles 3 days ago

Article

Introducing HELMET

5 days ago

• 19

Article

Hugging Face to sell open-source robots thanks to Pollen Robotics acquisition 🤖

7 days ago

• 36

upvoted a paper 3 days ago

ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness

Paper • 2504.10514 • Published 10 days ago • 45

upvoted 2 papers 4 days ago

Have we unified image generation and understanding yet? An empirical study of GPT-4o's image generation ability

Paper • 2504.08003 • Published 11 days ago • 45

TextArena

Paper • 2504.11442 • Published 5 days ago • 25

upvoted a paper 5 days ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 6 days ago • 228

upvoted a collection 5 days ago

InternVL3

34 items • Updated about 20 hours ago • 51

upvoted an article 5 days ago

Article

4M Models Scanned: Protect AI + Hugging Face 6 Months In

7 days ago

• 24

upvoted a collection 5 days ago

DataDecide

A suite of models, data, and evals over 25 corpora, 14 sizes, and 3 seeds to measure how accurately small experiments predict rankings at large scale. • 358 items • Updated 4 days ago • 11

upvoted a paper 6 days ago

SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning

Paper • 2504.08600 • Published 10 days ago • 25

upvoted a collection 7 days ago

GLM-4-0414

GLM-4-0414 series model • 8 items • Updated 6 days ago • 102

upvoted a paper 7 days ago

OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

Paper • 2504.07096 • Published 11 days ago • 70

upvoted 2 collections 7 days ago

Orpheus Multilingual Research Release

Beta Release of multilingual models. • 12 items • Updated 10 days ago • 76

Skywork-OR1

Skywork Open Reasoner 1 • 8 items • Updated 8 days ago • 21

upvoted 2 papers 9 days ago

Kimi-VL Technical Report

Paper • 2504.07491 • Published 11 days ago • 114

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published 19 days ago • 80