Jiarui Yao
FlippyDora
0 followers · 9 following
AI & ML interests
None yet
Recent Activity
updated a model 1 day ago: ScaleML-RLHF/Qwen2.5-Math-7B-raft-plusplus-numina_math_em-cliphigher0.35-n8-8-iter1
published a model 1 day ago: ScaleML-RLHF/Qwen2.5-Math-7B-raft-plusplus-numina_math_em-cliphigher0.35-n8-8-iter1
updated a model 2 days ago: ScaleML-RLHF/Qwen2.5-Math-7B-raftpp-cliphigher-n8-step130
Organizations
FlippyDora's activity
upvoted 2 papers 5 days ago
OTC: Optimal Tool Calls via Reinforcement Learning
Paper • 2504.14870 • Published 6 days ago • 31
ToolRL: Reward is All Tool Learning Needs
Paper • 2504.13958 • Published 10 days ago • 39
upvoted a paper about 2 months ago
Self-rewarding correction for mathematical reasoning
Paper • 2502.19613 • Published Feb 26 • 84