Today is a huge day in Argilla’s history. We couldn’t be more excited to share this with the community: we’re joining Hugging Face!
We’re embracing a larger mission, becoming part of a brilliant and kind team and a shared vision about the future of AI.
Over the past year, we’ve been collaborating with Hugging Face on countless projects: becoming a launch partner for Docker Spaces, empowering the community to clean Alpaca translations into Spanish and other languages, launching argilla/notus-7b-v1 building on Zephyr’s learnings, running the Data is Better Together initiative with hundreds of community contributors, and releasing argilla/OpenHermesPreferences, one of the largest open preference-tuning datasets.
After more than 2,000 Slack messages and over 60 people collaborating for over a year, it already felt like we were part of the same team, pushing in the same direction. After a week of the smoothest transition you can imagine, we’re now the same team.
To those of you who’ve been following us, this won’t be a huge surprise, but it will be a big deal in the coming months. This acquisition means we’ll double down on empowering the community to build and collaborate on high-quality datasets, we’ll bring full support for multimodal datasets, and we’ll be in a better place to collaborate with the Open Source AI community. For enterprises, this means that the Enterprise Hub will unlock highly requested features like single sign-on and integration with Inference Endpoints.
As a founder, I am proud of the Argilla team. We’re now part of something bigger, with a larger team but the same values, culture, and goals. I’m grateful to have shared this journey with my beloved co-founders Paco and Amélie.
Finally, huge thanks to the Chief Llama Officer @osanseviero for sparking this and being such a great partner during the acquisition process.
Would love to answer any questions you have so feel free to add them below!
Very excited to share the first two official Gemma variants from Google! Today at Google Cloud Next, we announced cutting-edge models for code and research!
First, google/codegemma-release-66152ac7b683e2667abdee11 - a new set of code-focused Gemma models at 2B and 7B, in both pretrained and instruction-tuned variants. These exhibit outstanding performance on academic benchmarks and (in my experience) real-life usage. Read more in the excellent Hugging Face blog: https://huggingface.co/blog/codegemma
Second, google/recurrentgemma-release-66152cbdd2d6619cb1665b7a - a research variant based on Google DeepMind’s outstanding Griffin work: https://arxiv.org/abs/2402.19427. RecurrentGemma enables higher throughput and vastly improved memory usage. We are excited about new architectures, especially at the lightweight Gemma sizes, where innovations like RecurrentGemma can scale modern AI to many more use cases.
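If you want to try the models from Python, here is a minimal, unofficial sketch using the standard transformers text-generation API. The google/codegemma-7b-it repo id is an assumption based on the collection above; check the collection pages for the exact checkpoints and licensing steps.

```python
# A minimal sketch of loading a CodeGemma checkpoint with transformers.
# Assumptions: a recent transformers release, the Gemma license accepted on the Hub,
# and the google/codegemma-7b-it repo id (see the collection for exact checkpoints).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/codegemma-7b-it"  # assumed instruction-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # keeps memory usage reasonable
    device_map="auto",           # requires `accelerate`; drop it to load on CPU
)

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The RecurrentGemma checkpoints should load the same way via their repo ids, as long as your transformers version already ships the recurrent_gemma architecture.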
For details on the launches of these models, check out our launch blog -- and please do not hesitate to send us feedback. We are excited to see what you build with CodeGemma and RecurrentGemma!
Huge thanks to the Hugging Face team for helping ensure that these models work flawlessly in the Hugging Face ecosystem at launch!
⭐ Today we’re releasing The Stack v2 & StarCoder2: a series of 3B, 7B & 15B code generation models trained on 3.3 to 4.5 trillion tokens of code:
- StarCoder2-15B matches or outperforms CodeLlama 34B, and approaches DeepSeek-33B on multiple benchmarks.
- StarCoder2-3B outperforms StarCoderBase-15B and similarly sized models.
- The Stack v2 is a 4x larger dataset than The Stack v1, resulting in 900B unique code tokens 🚀

As always, we released everything from models and datasets to curation code. Enjoy!
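For a quick local test, here is a minimal sketch of code completion with the standard transformers API. The bigcode/starcoder2-3b repo id is an assumption; see the model cards for the exact checkpoints and a transformers version that supports them.

```python
# A minimal sketch of code completion with StarCoder2 via transformers.
# The bigcode/starcoder2-3b repo id is assumed; swap in the 7B or 15B variant as needed.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder2-3b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```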
@HugoLaurencon, @Leyo & @VictorSanh are introducing HuggingFaceM4/WebSight, a multimodal dataset featuring 823,000 pairs of synthetically generated HTML/CSS code along with screenshots of the corresponding rendered websites, to train GPT4-V-like models 🌐💻
While crafting their upcoming foundation vision language model, they faced the challenge of converting website screenshots into usable HTML/CSS code. Most VLMs suck at this, and there was no public dataset available for this specific task, so they decided to create their own.
They prompted existing LLMs to generate HTML/CSS code for 823k very simple websites. Through supervised fine-tuning of a vision language model on WebSight, they were able to generate the code that reproduces a website component, given a screenshot.
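Here is a minimal sketch of what loading the dataset could look like with the datasets library, streaming it so you can peek at a few pairs without downloading all 823k examples; the actual column names are defined by the dataset card, so the snippet just prints them.

```python
# A minimal sketch: streaming WebSight with the datasets library.
from datasets import load_dataset

# Streaming avoids downloading the full dataset up front.
ds = load_dataset("HuggingFaceM4/WebSight", split="train", streaming=True)

# Inspect one example and its fields; check the dataset card for the exact schema.
example = next(iter(ds))
print(example.keys())
```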