nyuuzyou's picture

nyuuzyou PRO

nyuuzyou

AI & ML interests

None yet

Recent Activity

updated a dataset 20 minutes ago
nyuuzyou/OpenGameArt-Mixed-Licenses
published a dataset 33 minutes ago
nyuuzyou/OpenGameArt-Mixed-Licenses
View all activity

Organizations

Social Post Explorers's profile picture AI Starter Pack's profile picture

Posts 60

view post
Post
1654
🦅 SmolLM2-Eagle Collection - nyuuzyou/smollm2-eagle-680263bf97f0c7e6bbe4936b

Collection of fine-tuned bilingual language models featuring:
- Models in three parameter sizes: 135M, 360M, and 1.7B based on HuggingFaceTB's SmolLM2 models
- Both standard and GGUF formats for flexible deployment in llama.cpp and Ollama
- Fine-tuned on nyuuzyou/EagleSFT dataset (536,231 Russian-English QA pairs derived from 739k+ real user queries)
- Experimental Russian language capabilities while maintaining English performance
- Limited Russian capabilities due to SFT-only approach without Russian pre-training
- Environmental impact: ~19.75 kg CO2eq

This collection provides compact models for research on bilingual language capabilities, resource-constrained environments, and educational applications. Not recommended for production use due to experimental nature and inherent limitations. Available under Apache 2.0 license.
view post
Post
1677
🦅 EagleSFT Dataset - nyuuzyou/EagleSFT

Collection of 536,231 question-answer pairs featuring:

- Human-posed questions and machine-generated responses for SFT
- Bilingual content in Russian and English with linked IDs
- Derived from 739k+ real user queries, primarily educational topics
- Includes unique IDs and machine-generated category labels

This dataset provides a resource for supervised fine-tuning (SFT) of large language models, cross-lingual research, and understanding model responses to diverse user prompts. Released to the public domain under CC0 1.0 license.