John6666 (John Smith)

reacted to Kseniase's post with 👍 about 15 hours ago

Post

1503

11 new types of RAG

RAG is evolving fast, keeping pace with cutting-edge AI trends. It is becoming more agentic and smarter at navigating complex structures like hypergraphs.

Here are the 11 latest RAG types:

1. InstructRAG -> InstructRAG: Leveraging Retrieval-Augmented Generation on Instruction Graphs for LLM-Based Task Planning (2504.13032)
Combines RAG with a multi-agent framework, using a graph-based structure, an RL agent to expand task coverage, and a meta-learning agent for better generalization

2. CoRAG (Collaborative RAG) -> CoRAG: Collaborative Retrieval-Augmented Generation (2504.01883)
A collaborative framework that extends RAG to settings where clients train a shared model using a joint passage store

3. ReaRAG -> ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation (2503.21729)
It uses a Thought-Action-Observation loop to decide at each step whether to retrieve information or finalize an answer, reducing unnecessary reasoning and errors

4. MCTS-RAG -> MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search (2503.20757)
Combines RAG with Monte Carlo Tree Search (MCTS) to help small LMs handle complex, knowledge-heavy tasks

5. Typed-RAG -> Typed-RAG: Type-aware Multi-Aspect Decomposition for Non-Factoid Question Answering (2503.15879)
Improves answers to open-ended questions by identifying the question type (debate, personal experience, or comparison) and breaking the question down into simpler parts

6. MADAM-RAG -> Retrieval-Augmented Generation with Conflicting Evidence (2504.13079)
A multi-agent system where models debate answers over multiple rounds and an aggregator filters noise and misinformation

7. HM-RAG -> HM-RAG: Hierarchical Multi-Agent Multimodal Retrieval Augmented Generation (2504.12330)
A hierarchical multi-agent RAG framework that uses 3 agents: one to split queries, one to retrieve across multiple data types (text, graphs and web), and one to merge and refine answers

8. CDF-RAG -> CDF-RAG: Causal Dynamic Feedback for Adaptive Retrieval-Augmented Generation (2504.12560)
Works with causal graphs and enables multi-hop causal reasoning, refining queries. It validates responses against causal pathways
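
The Thought-Action-Observation loop described for ReaRAG (item 3) can be sketched roughly as follows. This is only an illustration of the control flow, not the paper's implementation: `toy_policy`, `toy_retrieve`, and the tiny `KNOWLEDGE` dictionary are hypothetical stand-ins for the reasoning model and the retriever.

```python
from typing import Callable, List, Tuple

# Hypothetical in-memory "corpus" standing in for a real retriever index.
KNOWLEDGE = {
    "capital of france": "Paris is the capital of France.",
}


def toy_retrieve(query: str) -> str:
    """Stand-in retriever: exact lookup in the toy corpus."""
    return KNOWLEDGE.get(query.lower(), "No relevant passage found.")


def toy_policy(question: str, observations: List[str]) -> Tuple[str, str, str]:
    """Stand-in for the reasoning model.

    Returns (thought, action, argument); action is "search" or "finish".
    """
    if not observations:
        return ("I need evidence first.", "search", question)
    return ("The evidence is enough.", "finish", observations[-1])


def tao_loop(
    question: str,
    policy: Callable = toy_policy,
    retrieve: Callable = toy_retrieve,
    max_steps: int = 5,
):
    """Run Thought -> Action -> Observation until the policy finishes."""
    observations: List[str] = []
    trace: List[Tuple[str, str]] = []
    for _ in range(max_steps):
        thought, action, arg = policy(question, observations)
        trace.append((thought, action))
        if action == "finish":
            return arg, trace
        # Action was "search": the retrieved passage becomes the observation.
        observations.append(retrieve(arg))
    return "No answer within the step budget.", trace


answer, trace = tao_loop("capital of France")
print(answer)
```

The `max_steps` cap is an assumption added here so the loop always terminates; a real system would replace both stand-in functions with model and retriever calls.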

To explore what Causal AI is, read our article: https://www.turingpost.com/p/causalai

Subscribe to the Turing Post: https://www.turingpost.com/subscribe

Read further 👇

1 reply

·

reacted to eaddario's post with 🔥 about 22 hours ago

Post

942

Tensor-wise (TWQ) and Layer-wise quantization (LWQ) now available in llama.cpp!

As of version b5125, users can do TWQ, which quantizes a whole tensor at a specific level, or LWQ, which quantizes only specific layers of one or more tensors.

The new --tensor-type option enables llama-quantize to apply user-defined quant levels to any combination of allowed tensors (i.e. tensors with 2 or more dimensions) and layer number, with support for regex patterns.

For example, to TWQ the Attention Value tensor you would use --tensor-type attn_v=q6_k and to perform LWQ you'll use something like --tensor-type "\.([0-9]|1[01257]|31)\.attn_v=q4_k"
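
To see which layers the LWQ pattern above would pick, here is a small check using Python's `re` module. It rests on two assumptions that the post does not state: that tensors are named like `blk.<layer>.attn_v.weight`, and that the pattern is applied as a substring search. The 32-layer model is hypothetical.

```python
import re

# Pattern copied from the post's LWQ example (the part before "=q4_k").
PATTERN = re.compile(r"\.([0-9]|1[01257]|31)\.attn_v")

# Assumption: a 32-layer model with tensors named blk.<layer>.attn_v.weight.
tensor_names = [f"blk.{i}.attn_v.weight" for i in range(32)]

# Keep the layer indices whose tensor name contains a match for the pattern.
selected_layers = [
    i for i, name in enumerate(tensor_names) if PATTERN.search(name)
]
print(selected_layers)
```

Under these assumptions the pattern selects layers 0 through 12, 15, 17, and 31, so 16 of the 32 `attn_v` tensors would get `q4_k`.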

In the next few days/weeks I'll update the models in my HF repo (and will add some others), but eaddario/DeepSeek-R1-Distill-Llama-8B-GGUF and eaddario/DeepSeek-R1-Distill-Qwen-7B-GGUF have already been LWQed.

For reference, compared to the naive Q4_K_M model, the LWQ Qwen-7B is almost 11% smaller (4.68GB vs 4.18GB) with only a 0.35% penalty on PPL!
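
The "almost 11%" figure follows directly from the two file sizes quoted above; a quick check, using only those numbers:

```python
naive_gb = 4.68  # Q4_K_M size quoted in the post
lwq_gb = 4.18    # LWQ size quoted in the post

# Relative size reduction, as a percentage of the naive model's size.
reduction_pct = (naive_gb - lwq_gb) / naive_gb * 100
print(f"{reduction_pct:.1f}% smaller")
```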

I'll update the https://medium.com/@eaddario/squeezing-tensor-bits-the-quest-for-smaller-llms-86b23bd052ca post to explain the process in detail, but in the meantime the following links will provide some background:

- Changes to llama-quantize: https://github.com/ggml-org/llama.cpp/pull/12511
- TWQ & LWQ tests: https://github.com/ggml-org/llama.cpp/discussions/12741
- Modified llama-imatrix (not yet merged) used to generate imatrix statistics to guide the TWQ and LWQ process: https://github.com/ggml-org/llama.cpp/pull/12718

reacted to openfree's post with 🔥 about 22 hours ago

Post

2980

🧠 ThinkFlow: The Revolutionary Platform That Gives LLMs the Power to Think 🚀

Hello AI community! We're excited to introduce you to ThinkFlow, an innovative service that transforms how language models solve problems. 🎉
VIDraft/ThinkFlow-llama

✨ What is ThinkFlow?
ThinkFlow is a groundbreaking platform that automatically applies step-by-step reasoning capabilities to existing LLM models without any modifications. It makes complex problem-solving transparent, allowing you to witness the model's thought process in real-time.

🔍 Key Features

Reasoning Without Model Modifications: Add step-by-step reasoning while utilizing existing LLMs as they are ⚙️
Visualized Thinking Process: See exactly how the model analyzes and solves problems 👁️
Before & After Comparison: Compare standard responses with reasoning-enhanced outputs in real-time 📊
Improved Accuracy: Deliver more accurate solutions for complex math and logic problems 📈
Educational Value: Teach students systematic approaches to problem-solving 👨‍🏫
User-Friendly Interface: Intuitive and easy-to-use UI for seamless experience 🖥️

💡 What Problems Can It Solve?
ThinkFlow is particularly effective for various domains including:

Complex mathematical problems 🧮
Logic puzzles 🧩
Questions requiring multi-step reasoning 🤔
Scientific analysis challenges 🔬
Complex decision-making processes 📝

👨‍💻 Technical Details
ThinkFlow is built on the meta-llama/Llama-3.1-8B-Instruct model and uses carefully designed prompt chains to guide the model through step-by-step thinking. Each reasoning step builds upon the results of previous steps, culminating in a comprehensive final answer.
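
A prompt chain of this kind can be sketched as below. This is a guess at the general shape, not ThinkFlow's code: the step prompts in `STEP_PROMPTS` are hypothetical, and `echo_llm` is a stand-in for a call to a real model such as Llama-3.1-8B-Instruct.

```python
from typing import Callable, List

# Hypothetical step prompts; the post does not list ThinkFlow's actual prompts.
STEP_PROMPTS = [
    "Restate the problem in your own words.",
    "List the facts and constraints that matter.",
    "Work through the solution step by step.",
    "State the final answer concisely.",
]


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model call: tags the last instruction it saw."""
    last_line = prompt.strip().splitlines()[-1]
    return f"[response to: {last_line}]"


def run_chain(question: str, llm: Callable[[str], str] = echo_llm) -> List[str]:
    """Run each step with all earlier step outputs included as context."""
    outputs: List[str] = []
    for instruction in STEP_PROMPTS:
        context = "\n".join(outputs)
        prompt = f"Question: {question}\n{context}\n{instruction}"
        outputs.append(llm(prompt))
    return outputs


steps = run_chain("What is 17 * 24?")
for number, text in enumerate(steps, start=1):
    print(number, text)
```

Because every prompt carries the outputs of the earlier steps, the last step sees the whole reasoning trail before producing the final answer.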

💬 Join Our Community!
If you have questions or suggestions about ThinkFlow, join our Discord community: https://discord.gg/openfreeai
Let's build better AI reasoning experiences together! 💪

#AI #LLM #ReasoningAI #ThinkFlow #HuggingFace #OpenSource #AIEducation

9 replies

·

reacted to Beegbrain's post with 🚀 about 22 hours ago

Post

477

Hello, I've just written an article explaining the project my team and I built at the Mistral AI Robotic Hackathon one week ago: https://huggingface.co/blog/Beegbrain/guess-who-so100-mistral. Feel free to take a look. We are open-sourcing the code and starting a community project around the idea, so reach out if you want to participate.

1 reply

·

reacted to zhiminy's post with 🚀 about 22 hours ago

Post

1048

# 🚀 SE Arena: Evaluating Foundation Models for Software Engineering

**SE Arena** is the first open-source platform for evaluating foundation models in real-world software engineering workflows.

## What makes it unique?

- **RepoChat**: Automatically injects repository context (issues, commits, PRs) into conversations for more realistic evaluations
- **Multi-round interactions**: Tests models through iterative workflows, not just single prompts
- **Novel metrics**: Includes a "consistency score" that measures model determinism through self-play matches

Try it now: SE-Arena/Software-Engineering-Arena

## Why it matters

Traditional evaluation frameworks don't capture how developers actually use models in their daily work. SE Arena creates a testing environment that mirrors real engineering workflows, helping you choose the right model for your specific software development needs.

From debugging to requirement refinement, see which models truly excel at software engineering tasks!

reacted to MonsterMMORPG's post with 👀 about 22 hours ago

Post

807

Tencent InstantCharacter 1-Click Installers for Windows, RunPod and Massed Compute, Supports RTX 5000 series as well

Latest installer zip file : https://www.patreon.com/posts/126995127

Use above link to get installer zip file

Official repo : https://github.com/Tencent/InstantCharacter
I have significantly improved the official Repo app
Put FLUX LoRAs into loras folder, it will download 3 LoRAs by default
It will download necessary models into models folder automatically
A lower Character Scale (e.g. 0.6 or 0.8) makes the output more stylized
The official repo's Gradio app was completely broken; I fixed and improved it and added new features such as automatically saving every generated image, a number-of-generations setting, and more
Currently you need a GPU with at least 48GB of VRAM; I am trying to make it work with less VRAM via quantization

2 replies

·

reacted to JLouisBiz's post with 🔥 about 22 hours ago

Post

1045

Back to LLM integration.

ClickDefine.sh -- quickly define or explain anything within your whole desktop environment

You only need to run the model locally, for example with **llama.cpp** or **ollama**

- https://github.com/ggml-org/llama.cpp
- https://ollama.com/download

And you get a universal explanation tool that works anywhere on your X.Org desktop (on operating systems that are usually fully free software, like Debian GNU/Linux)

ClickDefine - Interactive Text Processor Script for Iterative LLM Query Handling:
https://hyperscope.link/9/6/0/9/8/ClickDefine-Interactive-Text-Processor-Script-for-Iterative-LLM-Query-Handling-96098.html

Watch the demonstration here: https://www.youtube.com/watch?v=mQxCYAiReu0&t=2s

reacted to onekq's post with 👀 about 22 hours ago

Post

215

This post discussed the same trend as the Sutton post, but is more concrete and down-to-earth.

https://ysymyth.github.io/The-Second-Half/

Two takeaways for me. (1) Deep neural networks are the backbone that unifies everything. RLHF will stand the test of time because it brings two distinct fields (NLP and RL) onto the same model weights. (2) Language models will continue to play a central role in the era of agents. They probably won't be the end game for AGI, but they are definitely not an off-ramp.

reacted to ginipick's post with 🔥 about 22 hours ago

Post

1138

🤖 AI Academic Paper Generator: Your Research Partner 🎓

Hello, researchers! Today I'm introducing my AI Academic Paper Generation System. This application is built with Streamlit and provides AI agents to assist with every stage of the academic research process.

ginipick/AgentX-Papers

✨ Key Features

📚 Literature Research: AI reviews and summarizes relevant research
📝 Paper Outline: Generates a well-structured paper outline
✍️ Draft Writing: Creates a paper draft based on your research topic
🔗 Citation Generation: Automatically generates academic citations
🖋️ Editing & Polishing: Checks grammar, context, and logical flow
🌐 Multilingual Support: Interface available in English and Korean

🚀 How to Use

Enter basic information like research topic, paper title, and deadline
AI agents generate everything from literature review to final paper
Download your completed paper or consult with the chatbot for further assistance

💡 What Makes It Special
This tool integrates all stages of academic research. Going beyond simple text generation, it mimics the actual research process to produce higher quality papers.
Visualization features and social media sharing options will be added in the next update! 💪

#AIResearch #AcademicWriting #ResearchAssistant #ArtificialIntelligence

reacted to aiqtech's post with 🔥 about 22 hours ago

Post

1344

🌐 AI Token Visualization Tool with Perfect Multilingual Support

Hello! Today I'm introducing my Token Visualization Tool with comprehensive multilingual support. This web-based application allows you to see how various Large Language Models (LLMs) tokenize text.

aiqtech/LLM-Token-Visual

✨ Key Features

🤖 Multiple LLM Tokenizers: Support for Llama 4, Mistral, Gemma, Deepseek, QWQ, BERT, and more
🔄 Custom Model Support: Use any tokenizer available on HuggingFace
📊 Detailed Token Statistics: Analyze total tokens, unique tokens, compression ratio, and more
🌈 Visual Token Representation: Each token assigned a unique color for visual distinction
📂 File Analysis Support: Upload and analyze large files
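
The statistics listed above can be computed from any token list. The sketch below uses a hypothetical regex tokenizer (`toy_tokenize`) instead of a real Hugging Face tokenizer so that it runs without downloads, and it assumes "compression ratio" means characters per token, which the post does not define.

```python
import re
from collections import Counter


def toy_tokenize(text: str) -> list:
    """Hypothetical tokenizer: runs of word characters, or single symbols."""
    return re.findall(r"\w+|[^\w\s]", text)


def token_stats(text: str, tokenize=toy_tokenize) -> dict:
    """Compute simple token statistics for a piece of text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    return {
        "total_tokens": len(tokens),
        "unique_tokens": len(counts),
        # Assumption: compression ratio is measured as characters per token.
        "chars_per_token": len(text) / len(tokens) if tokens else 0.0,
    }


stats = token_stats("the cat sat on the mat.")
print(stats)
```

Passing a different `tokenize` callable, for example one wrapping a real model tokenizer, would let the same function compare token counts across models or languages.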

🌏 Powerful Multilingual Support
The most significant advantage of this tool is its perfect support for all languages:

📝 Asian languages including Korean, Chinese, and Japanese fully supported
🔤 RTL (right-to-left) languages like Arabic and Hebrew supported
🈺 Special characters and emoji tokenization visualization
🧩 Compare tokenization differences between languages
💬 Mixed multilingual text processing analysis

🚀 How It Works

Select your desired tokenizer model (predefined or HuggingFace model ID)
Input multilingual text or upload a file for analysis
Click 'Analyze Text' to see the tokenized results
Visually understand how the model breaks down various languages with color-coded tokens

💡 Benefits of Multilingual Processing
Understanding multilingual text tokenization patterns helps you:

Optimize prompts that mix multiple languages
Compare token efficiency across languages (e.g., English vs. Korean vs. Chinese token usage)
Predict token usage for internationalization (i18n) applications
Optimize costs for multilingual AI services

🛠️ Technology Stack

Backend: Flask (Python)
Frontend: HTML, CSS, JavaScript (jQuery)
Tokenizers: 🤗 Transformers library

reacted to seawolf2357's post with 🔥 about 22 hours ago

Post

1733

📚 Papers Leaderboard - See the Latest AI Research Trends at a Glance! ✨

Hello, AI research community! Today I'm introducing a new tool for exploring research papers. Papers Leaderboard is an open-source dashboard that makes it easy to find and filter the latest AI research papers.

Heartsync/Papers-Leaderboard

🌟 Key Features

Date Filtering: View only papers published within a specific timeframe (from May 5, 2023 to present)
Title Search: Quickly find papers containing your keywords of interest
Abstract Search: Explore paper content more deeply by searching for keywords within abstracts
Automatic Updates: The database is updated with the latest papers every hour

💡 How to Use It?

Select a start date and end date
Enter keywords you want to find in titles or abstracts
Adjust the maximum number of search results for abstract searches
Results are displayed neatly in table format
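
The filtering described above (date range, title keyword, abstract keyword, result cap) might look roughly like this. The `Paper` class, the `filter_papers` function, and the sample records are all hypothetical and are not taken from the dashboard's code.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Paper:
    title: str
    abstract: str
    published: date


def filter_papers(
    papers: List[Paper],
    start: date,
    end: date,
    title_kw: str = "",
    abstract_kw: str = "",
    max_results: int = 10,
) -> List[Paper]:
    """Keep papers in the date range whose title and abstract contain the keywords."""
    title_kw = title_kw.lower()
    abstract_kw = abstract_kw.lower()
    hits = [
        p for p in papers
        if start <= p.published <= end
        and title_kw in p.title.lower()        # empty keyword matches everything
        and abstract_kw in p.abstract.lower()
    ]
    hits.sort(key=lambda p: p.published, reverse=True)  # newest first
    return hits[:max_results]


# Made-up records, for illustration only.
papers = [
    Paper("Example paper on retrieval", "We study retrieval with graphs.", date(2025, 4, 1)),
    Paper("Example paper on quantization", "We compress tensors.", date(2025, 4, 10)),
    Paper("Older example on retrieval", "Early retrieval work.", date(2023, 1, 1)),
]
result = filter_papers(papers, date(2023, 5, 5), date(2025, 12, 31), title_kw="retrieval")
print([p.title for p in result])
```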

reacted to merterbak's post with 👀 2 days ago

Post

1704

Here’s a cool paper I found: “Massive Image Embedding Benchmark (MIEB).” It is a new tool to test how good image embedding models are. It has 130 different tasks grouped into 8 categories, like image search, classification, clustering similar images, answering questions based on images, and understanding documents. It even covers 38 different languages.

The authors tested 50 models and found that no single model was best at everything. Some models were great at recognizing text inside images but struggled to handle complicated tasks like matching images and text that appear together.

Paper: https://arxiv.org/pdf/2504.10471v1
Code: https://github.com/embeddings-benchmark/mteb

2 replies

·

reacted to nyuuzyou's post with 👍 2 days ago

Post

1529

🦅 SmolLM2-Eagle Collection - nyuuzyou/smollm2-eagle-680263bf97f0c7e6bbe4936b

Collection of fine-tuned bilingual language models featuring:
- Models in three parameter sizes: 135M, 360M, and 1.7B based on HuggingFaceTB's SmolLM2 models
- Both standard and GGUF formats for flexible deployment in llama.cpp and Ollama
- Fine-tuned on nyuuzyou/EagleSFT dataset (536,231 Russian-English QA pairs derived from 739k+ real user queries)
- Experimental Russian language capabilities while maintaining English performance
- Limited Russian capabilities due to SFT-only approach without Russian pre-training
- Environmental impact: ~19.75 kg CO2eq

This collection provides compact models for research on bilingual language capabilities, resource-constrained environments, and educational applications. Not recommended for production use due to experimental nature and inherent limitations. Available under Apache 2.0 license.

1 reply

·

replied to educrpg's post 2 days ago

Same here... with dedicated Endpoints as well...
https://discuss.huggingface.co/t/my-space-suddenly-went-offline-the-cpu-cannot-restart/151121
https://discuss.huggingface.co/t/501-unauthorized-error/151251
https://discuss.huggingface.co/t/error-400-when-i-update-endpoints-to-lastest-version/151229

It seems that hysts contacted HF internally for now.

reacted to educrpg's post with 🔥 2 days ago

Post

1717

anyone have all their spaces stuck in building now?

3 replies

·

reacted to onekq's post with 🔥 2 days ago

Post

1414

This is bitter lesson 2.0
https://storage.googleapis.com/deepmind-media/Era-of-Experience%20/The%20Era%20of%20Experience%20Paper.pdf

If this reads too lofty to you, consider some low-hanging fruit. Experiences here are reward signals we send to LLMs, e.g. human scores in RLHF, verification in AlphaProof, or test results for code generation.

RFT (reinforced finetuning) will become mainstream and, IMO, make LLMs behave more like agents.

1 reply

·

reacted to samuellimabraz's post with 🤗 2 days ago

Post

1499

I recently had the opportunity to present at a Computer Vision Hangout, sharing my journey from autonomous drone competitions to fine-tuning Vision-Language Models.

I built an interactive presentation app! Here's a glimpse of the topics:

🚁 Black Bee Drones:
My first steps into CV with Latin America's first autonomous drone team. Covering classical CV techniques (filtering, edge detection), the IMAV 2023 mission (ArUco detection, line following with PID control), and links to demos for OpenCV basics and PID simulation.

🤖 Asimo Foundation:
Using MediaPipe for gesture control of a robotic arm in an educational project.

☕ CafeDL:
Building a small Deep Learning framework from scratch in Java (inspired by Keras, using ND4J) and training a CNN for a QuickDraw-like app.

🏢 Tech4Humans:
Real-world applications, including open-source signature detection and efficient fine-tuning of VLMs for document extraction.

Check out the interactive demos (also embedded in the main app):

1️⃣ CV Hangout App: The main presentation app showcasing my journey.
samuellimabraz/cv-hangout

2️⃣ OpenCV GUI: Real-time demo of CV techniques (filters, color filtering, ArUco) & AI models.
samuellimabraz/opencv-gui

3️⃣ Line Follow PID: Simulation of a PID controller for drone line-following.
samuellimabraz/line-follow-pid
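
For readers new to PID control, the idea behind line following can be sketched in a few lines. This is not the code from the demo above: the gains, the time step, and the very simple motion model (controller output treated directly as lateral velocity) are assumptions chosen for illustration.

```python
class PID:
    """Textbook PID controller with a backward-difference derivative."""

    def __init__(self, kp: float, ki: float, kd: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error: float, dt: float) -> float:
        self.integral += error * dt
        # No derivative term on the first call, since there is no previous error.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(steps: int = 200, dt: float = 0.05) -> list:
    """Simulate lateral offset from the line; the line itself is at offset 0."""
    pid = PID(kp=2.0, ki=0.1, kd=0.3)  # hypothetical gains
    offset = 1.0                       # start one unit away from the line
    history = [offset]
    for _ in range(steps):
        error = 0.0 - offset
        lateral_velocity = pid.update(error, dt)  # assumption: output = velocity
        offset += lateral_velocity * dt
        history.append(offset)
    return history


history = simulate()
print(f"start offset: {history[0]:.2f}, final offset: {history[-1]:.3f}")
```

With these values the offset shrinks toward zero over the simulated ten seconds; on a real drone the error would come from the detected line position in the camera image rather than from a simulated state.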

I hope these resources are helpful to someone on their CV learning journey!

reacted to Reality123b's post with 👍 2 days ago

Post

451

Lap1official/Curated-Reasoning
made a new dataset.
this is a curated hybrid reasoning dataset.
maybe the first on the hub.

reacted to davidberenstein1957's post with 🧠 3 days ago

Post

1533

🧑‍🏫 I wrote a brief blogpost to give An Introduction to AI Model Optimization Techniques!

URL: https://huggingface.co/blog/PrunaAI/introduction-to-ai-model-optimization-techniques

reacted to philschmid's post with 🔥 3 days ago

Post

1822

Gemini 2.5 Flash is here! We are excited to launch our first hybrid reasoning Gemini model. In 2.5 Flash, developers can turn thinking off.

**TL;DR:**
- 🧠 Controllable "Thinking" with a thinking budget of up to 24k tokens
- 🌌 1 Million multimodal input context for text, image, video, audio, and pdf
- 🛠️ Function calling, structured output, google search & code execution.
- 🏦 $0.15 per 1M input tokens; $0.6 or $3.5 (thinking on) per 1M output tokens (thinking tokens are billed as output tokens)
- 💡 Knowledge cutoff of January 2025
- 🚀 Rate limits - Free 10 RPM 500 req/day
- 🏅Outperforms 2.0 Flash on every benchmark
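
To get a feel for the pricing above, here is a small cost estimator. It uses only the per-million-token prices quoted in the list; the function name `estimate_cost` is hypothetical, and the prices may change, so treat it as a sketch.

```python
# Prices in USD per one million tokens, as quoted in the post.
INPUT_PER_M = 0.15
OUTPUT_PER_M = 0.60           # thinking off
OUTPUT_THINKING_PER_M = 3.50  # thinking on; thinking tokens count as output


def estimate_cost(input_tokens: int, output_tokens: int, thinking: bool = False) -> float:
    """Estimate request cost in USD; output_tokens must include thinking tokens."""
    out_rate = OUTPUT_THINKING_PER_M if thinking else OUTPUT_PER_M
    return (input_tokens * INPUT_PER_M + output_tokens * out_rate) / 1_000_000


cost_plain = estimate_cost(1_000_000, 1_000_000)
cost_thinking = estimate_cost(1_000_000, 1_000_000, thinking=True)
print(f"thinking off: ${cost_plain:.2f}, thinking on: ${cost_thinking:.2f}")
```

Since thinking tokens are billed at the output rate, a long reasoning trace can dominate the bill even when the visible answer is short.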

Try it ⬇️
https://aistudio.google.com/prompts/new_chat?model=gemini-2.5-flash-preview-04-17

1 reply

·
