
David Quispe PRO

daqc

AI & ML interests: Education


Organizations

Hackathon Somos NLP 2023: Los LLMs hablan Español · SomosNLP · Blog-explorers · MLX Community · PlawLabs · SyndroBytes · Hackathon SomosNLP 2025 · AI Starter Pack · Quechua LLMs on Hugging Face

daqc's activity

upvoted 2 articles 7 days ago

NVIDIA's GTC 2025 Announcement for Physical AI Developers: New Open Models and Datasets

• 35

Hugging Face to sell open-source robots thanks to Pollen Robotics acquisition 🤖

• 38
upvoted 2 articles 17 days ago

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

• 398

Welcome Llama 4 Maverick & Scout on Hugging Face!

• 140
reacted to lewtun's post with 🔥 27 days ago
Introducing OlympicCoder: a series of open reasoning models that can solve olympiad-level programming problems 🧑‍💻

- 7B: open-r1/OlympicCoder-7B
- 32B: open-r1/OlympicCoder-32B

We find that OlympicCoder models outperform Claude 3.7 Sonnet, as well as models over 100x larger 💪
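
You can try the checkpoints locally. Below is a minimal sketch using the Transformers text-generation pipeline; the prompt and generation settings are illustrative assumptions, not a recommended configuration from the report.

```python
# Minimal sketch: run OlympicCoder-7B with the Transformers text-generation pipeline.
# The prompt and generation settings below are illustrative assumptions.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="open-r1/OlympicCoder-7B",
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Write a C++ program that reads n integers and prints their sum."},
]
result = generator(messages, max_new_tokens=1024)
print(result[0]["generated_text"][-1]["content"])
```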

Together with the models, we are releasing:

📊 CodeForces-CoTs: a new dataset of code problems from the most popular competitive coding platform, with R1 traces in C++ and Python: open-r1/codeforces-cots (see the loading sketch after this list)

πŸ† IOI'2024: a new benchmark of VERY hard programming problems where even frontier models struggle to match human performance open-r1/ioi

For links to the models and datasets, check out our latest progress report from Open R1: https://huggingface.co/blog/open-r1/update-3
  • 1 reply
reacted to lewtun's post with ❤️ 27 days ago
Introducing OpenR1-Math-220k!

open-r1/OpenR1-Math-220k

The community has been busy distilling DeepSeek-R1 from inference providers, but we decided to have a go at doing it ourselves from scratch 💪

What's new compared to existing reasoning datasets?

♾ Based on AI-MO/NuminaMath-1.5: we focus on math reasoning traces and generate answers for problems in NuminaMath 1.5, an improved version of the popular NuminaMath-CoT dataset.

🐳 800k R1 reasoning traces: We generate two answers for 400k problems using DeepSeek R1. The filtered dataset contains 220k problems with correct reasoning traces.

📤 512 H100s running locally: Instead of relying on an API, we leverage vLLM and SGLang to run generations locally on our science cluster, generating 180k reasoning traces per day (see the generate-and-verify sketch after this list).

⏳ Automated filtering: We apply Math Verify to retain only problems with at least one correct answer. We also leverage Llama3.3-70B-Instruct as a judge to retrieve more correct examples (e.g. for cases with malformed answers that can't be verified with a rules-based parser).

📊 We match the performance of DeepSeek-Distill-Qwen-7B by finetuning Qwen-7B-Math-Instruct on our dataset (a minimal fine-tuning sketch closes this post).
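
Roughly, the generation and filtering steps above look like the sketch below: sample traces with vLLM, then keep only answers that Math Verify confirms against the reference. The model ID, prompt format, and field names are placeholders, and the Llama3.3-70B judge fallback is omitted.

```python
# Sketch of the generate-then-verify loop: sample reasoning traces with vLLM and
# keep only problems where Math Verify confirms at least one final answer.
# Model ID, prompt format, and field names are illustrative placeholders.
from vllm import LLM, SamplingParams
from math_verify import parse, verify

llm = LLM(model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B")  # placeholder; the post ran DeepSeek R1 on 512 H100s
params = SamplingParams(temperature=0.6, max_tokens=8192, n=2)  # two samples per problem, as in the post

problems = [{"question": "What is 7 * 8?", "answer": "56"}]  # placeholder problems
prompts = [f"Solve the problem and put the final answer in \\boxed{{}}.\n\n{p['question']}" for p in problems]

kept = []
for problem, request_output in zip(problems, llm.generate(prompts, params)):
    gold = parse(problem["answer"])
    for completion in request_output.outputs:
        # Rules-based check; a judge model could rescue malformed answers the parser rejects.
        if verify(gold, parse(completion.text)):
            kept.append({"question": problem["question"], "trace": completion.text})
            break

print(f"Retained {len(kept)}/{len(problems)} problems with at least one verified trace")
```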

🔎 Read our blog post for all the nitty-gritty details: https://huggingface.co/blog/open-r1/update-2
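
To reproduce the distillation step at a smaller scale, here is a hedged sketch with TRL's SFTTrainer. The placeholder base model, default split, and default hyperparameters are assumptions, not the Qwen-7B-Math-Instruct setup from the post.

```python
# Sketch: load OpenR1-Math-220k and fine-tune a small placeholder model with TRL's SFTTrainer.
# Subset/split and hyperparameters are assumptions, not the post's actual configuration.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("open-r1/OpenR1-Math-220k", split="train")

# SFTTrainer expects a conversational "messages" column or a plain "text" column;
# remap the dataset columns first if the schema differs.
trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # small placeholder; the post finetuned Qwen-7B-Math-Instruct
    train_dataset=dataset,
    args=SFTConfig(output_dir="openr1-math-sft"),
)
trainer.train()
```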