
Joseph Robert Turcotte PRO

Fishtiks

AI & ML interests

Roleplaying, lorabration, abliteration, smol models, extensive filtering, unusual datasets, home usage, HPCs for AI, distributed training/federated learning, and sentience. AI should find and label AI hallucinations with GANs so we can give them context and use.

Recent Activity

liked a model 1 day ago
Nitral-AI/Violet_Magcap-12B
upvoted a collection 1 day ago
SmolLM2-Eagle
liked a dataset 1 day ago
VLLMs/MIRB

Organizations

None yet

Fishtiks's activity

reacted to openfree's post with 🔥 1 day ago
🧠 ThinkFlow: The Revolutionary Platform That Gives LLMs the Power to Think 🚀

Hello AI community! We're excited to introduce you to ThinkFlow, an innovative service that transforms how language models solve problems. 🎉
VIDraft/ThinkFlow-llama

✨ What is ThinkFlow?
ThinkFlow is a groundbreaking platform that automatically applies step-by-step reasoning capabilities to existing LLM models without any modifications. It makes complex problem-solving transparent, allowing you to witness the model's thought process in real-time.

๐Ÿ” Key Features

Reasoning Without Model Modifications: Add step-by-step reasoning while using existing LLMs as they are ⚙️
Visualized Thinking Process: See exactly how the model analyzes and solves problems 👁️
Before & After Comparison: Compare standard responses with reasoning-enhanced outputs in real time 📊
Improved Accuracy: Deliver more accurate solutions for complex math and logic problems 📈
Educational Value: Teach students systematic approaches to problem-solving 👨‍🏫
User-Friendly Interface: Intuitive and easy-to-use UI for a seamless experience 🖥️

💡 What Problems Can It Solve?
ThinkFlow is particularly effective for various domains including:

Complex mathematical problems 🧮
Logic puzzles 🧩
Questions requiring multi-step reasoning 🤔
Scientific analysis challenges 🔬
Complex decision-making processes 📝

๐Ÿ‘จโ€๐Ÿ’ป Technical Details
ThinkFlow is built on the meta-llama/Llama-3.1-8B-Instruct model and uses carefully designed prompt chains to guide the model through step-by-step thinking. Each reasoning step builds upon the results of previous steps, culminating in a comprehensive final answer.
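The prompt-chaining idea described above can be sketched in a few lines. This is my own minimal illustration, not ThinkFlow's actual code; `llm` stands for any callable that sends a prompt to a model and returns its completion.

```python
def chain_of_steps(llm, problem, steps):
    """Run reasoning steps in order, feeding prior results into each new prompt."""
    context = f"Problem: {problem}"
    results = []
    for step in steps:
        prompt = f"{context}\n\nStep: {step}\nAnswer concisely:"
        answer = llm(prompt)
        results.append((step, answer))
        context += f"\n{step}: {answer}"  # later steps see earlier answers
    # final answer builds on the accumulated step results
    final = llm(f"{context}\n\nGive the final answer:")
    return results, final
```

Each step's prompt includes the accumulated context, so later steps build on earlier results, which is what makes the thought process inspectable step by step.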

💬 Join Our Community!
If you have questions or suggestions about ThinkFlow, join our Discord community: https://discord.gg/openfreeai
Let's build better AI reasoning experiences together! 💪

#AI #LLM #ReasoningAI #ThinkFlow #HuggingFace #OpenSource #AIEducation
reacted to aiqtech's post with 🔥 1 day ago
๐ŸŒ AI Token Visualization Tool with Perfect Multilingual Support

Hello! Today I'm introducing my Token Visualization Tool with comprehensive multilingual support. This web-based application allows you to see how various Large Language Models (LLMs) tokenize text.

aiqtech/LLM-Token-Visual

✨ Key Features

🤖 Multiple LLM Tokenizers: Support for Llama 4, Mistral, Gemma, Deepseek, QWQ, BERT, and more
🔄 Custom Model Support: Use any tokenizer available on HuggingFace
📊 Detailed Token Statistics: Analyze total tokens, unique tokens, compression ratio, and more
🌈 Visual Token Representation: Each token is assigned a unique color for visual distinction
📂 File Analysis Support: Upload and analyze large files

๐ŸŒ Powerful Multilingual Support
The most significant advantage of this tool is its perfect support for all languages:

๐Ÿ“ Asian languages including Korean, Chinese, and Japanese fully supported
๐Ÿ”ค RTL (right-to-left) languages like Arabic and Hebrew supported
๐Ÿˆบ Special characters and emoji tokenization visualization
๐Ÿงฉ Compare tokenization differences between languages
๐Ÿ’ฌ Mixed multilingual text processing analysis

🚀 How It Works

1. Select your desired tokenizer model (predefined or HuggingFace model ID)
2. Input multilingual text or upload a file for analysis
3. Click 'Analyze Text' to see the tokenized results
4. Visually understand how the model breaks down various languages with color-coded tokens
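The color-coding and statistics steps can be sketched roughly like this. This is my own illustration, not the app's source, and the `compression_ratio` definition here (input characters per token) is just one plausible choice:

```python
import hashlib

def colorize_tokens(tokens):
    """Map each unique token to a stable hex color for display."""
    colors = {}
    for tok in tokens:
        if tok not in colors:
            # hash the token so the same token always gets the same color
            digest = hashlib.md5(tok.encode("utf-8")).hexdigest()
            colors[tok] = f"#{digest[:6]}"  # first 3 bytes as RGB
    return colors

def token_stats(text, tokens):
    """Basic statistics of the kind the tool reports."""
    return {
        "total_tokens": len(tokens),
        "unique_tokens": len(set(tokens)),
        # one possible definition: input characters per token
        "compression_ratio": round(len(text) / max(len(tokens), 1), 3),
    }
```

With 🤗 Transformers, the token list could come from something like `AutoTokenizer.from_pretrained(model_id).tokenize(text)`.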

💡 Benefits of Multilingual Processing
Understanding multilingual text tokenization patterns helps you:

Optimize prompts that mix multiple languages
Compare token efficiency across languages (e.g., English vs. Korean vs. Chinese token usage)
Predict token usage for internationalization (i18n) applications
Optimize costs for multilingual AI services

๐Ÿ› ๏ธ Technology Stack

Backend: Flask (Python)
Frontend: HTML, CSS, JavaScript (jQuery)
Tokenizers: 🤗 Transformers library
reacted to JLouisBiz's post with 🔥 1 day ago
Back to LLM integration.

ClickDefine.sh -- quickly define or explain anything within your whole desktop environment

You only need to run a model locally, for example with **llama.cpp** or **ollama**:

- https://github.com/ggml-org/llama.cpp
- https://ollama.com/download

And you get a universal explaining tool that works anywhere on your X.Org desktop (on operating systems that are usually Fully Free Software, like Debian GNU/Linux).

ClickDefine - Interactive Text Processor Script for Iterative LLM Query Handling:
https://hyperscope.link/9/6/0/9/8/ClickDefine-Interactive-Text-Processor-Script-for-Iterative-LLM-Query-Handling-96098.html

Watch the demonstration here: https://www.youtube.com/watch?v=mQxCYAiReu0&t=2s
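The core loop behind such a tool is: take the selected text, send it to the local model server, show the answer. A rough Python illustration (not the ClickDefine.sh script itself), assuming an Ollama server on its default port; the default model name `llama3` is a hypothetical placeholder:

```python
import json
import urllib.request

def build_request(selection, model="llama3"):
    """Build the JSON payload for Ollama's /api/generate endpoint.

    `model` is a placeholder; use whatever model you have pulled locally.
    """
    return {
        "model": model,
        "prompt": f"Define or explain the following:\n\n{selection}",
        "stream": False,
    }

def explain(selection, url="http://localhost:11434/api/generate"):
    """Send the selection to the local server and return the model's answer."""
    payload = json.dumps(build_request(selection)).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

On a real desktop the selection would come from a tool like `xsel` or `xclip`, with the answer shown in a popup.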
reacted to onekq's post with 🔥 3 days ago
This is the Bitter Lesson 2.0.
https://storage.googleapis.com/deepmind-media/Era-of-Experience%20/The%20Era%20of%20Experience%20Paper.pdf

If this reads as too lofty, consider some low-hanging fruit. Experiences here are reward signals we send to LLMs, e.g. the human score in RLHF, verification in AlphaProof, or test results for code generation.

RFT (reinforced fine-tuning) will become mainstream and, IMO, make LLMs behave more like agents.
reacted to educrpg's post with 🔥 3 days ago
Anyone have all their Spaces stuck in building now?
reacted to freddyaboulton's post with ❤️ 7 days ago
Ever wanted to share your AI creations with friends? ✨

Screenshots are fine, but imagine letting others play with your ACTUAL model!

Introducing Gradio deep links 🔗 - now you can share interactive AI apps, not just images.

Add a gr.DeepLinkButton to any app and get shareable URLs that let ANYONE experiment with your models.

reacted to openfree's post with 🔥 7 days ago
Agentic AI Era: Analyzing MCP vs MCO 🚀

Hello everyone!
With the rapid advancement of AI agent technology, two architectures have come into the spotlight: MCP (Model Context Protocol) and MCO (Model Context Open-json). Today, we'll introduce the key features and differences of these two approaches.

VIDraft/Agentic-AI-CHAT

MCP: The Traditional Approach 🏛️
Centralized Function Registry: All functions are hardcoded into the core system.

Static Function Definitions & Tight Coupling: New features require changes to the core application code, limiting scalability.

Monolithic Design: Complex deployment and version management can cause a single error to affect the whole system.

Code Example:
```py
FUNCTION_REGISTRY = {
    "existing_function": existing_function,
    "new_function": new_function,  # adding a new function
}
```

MCO: A Revolutionary Approach 🆕
JSON-based Function Definitions: Function details are stored in external JSON files, enabling dynamic module loading.

Loose Coupling & Microservices: Each function can be developed, tested, and deployed as an independent module.

Flexible Scalability: Add new features by simply updating the JSON and module files, without modifying the core system.

JSON Example:
```json
[
  {
    "name": "analyze_sentiment",
    "module_path": "nlp_tools",
    "func_name_in_module": "sentiment_analysis",
    "example_usage": "analyze_sentiment(text=\"I love this product!\")"
  }
]
```
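A minimal sketch of how such a JSON spec could be resolved into callables at runtime (illustrative only, not VIDraft's implementation):

```python
import importlib

def load_functions(specs):
    """Build a name -> callable registry from JSON function definitions.

    Each spec needs "name", "module_path", and "func_name_in_module",
    matching the JSON layout above.
    """
    registry = {}
    for spec in specs:
        module = importlib.import_module(spec["module_path"])
        registry[spec["name"]] = getattr(module, spec["func_name_in_module"])
    return registry
```

Adding a new function then means editing the JSON and dropping in a module; the loader itself never changes, which is the loose coupling the post describes.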

Why MCO? 💡
Enhanced Development Efficiency: Developers can focus on their own modules with independent testing and deployment.

Simplified Error Management: Errors remain confined within their modules, enabling quick hotfixes.

Future-Proofing: With potential features like remote function calls (RPC), access control, auto-documentation, and a function marketplace, MCO paves the way for rapid innovation.

Practical Use & Community 🤝
The MCO implementation has been successfully tested on VIDraft's LLM (based on Google Gemma-3).
posted an update 8 days ago
I want to process AI for free. I know about Hyra AI, Acurast, NATIX, and some other stuff you can do on your phone. I mean that I want to process toward your projects for free on my computer. I can do a little now, but I can do much more if I'm able to upgrade (nobody is telling me where they're getting H100s, but I may be able to get custom cards from the source). I was curious if any distributed processing is being done with PC and HPC, like BOINC and Folding@home, but specifically for AI, and I figured this is the place to ask.

What projects can you recommend to put my CPU and GPU to use until I potentially get a dual CPU, dual to triple custom GPU, custom NPU, and mini-OPU setup, like Jean Zay, but smaller? I don't have that many resources to put to use currently, but I have more than the Androids I'm using for my Aiyara cluster for BOINC, so help me use the gaming PC for something more useful than gaming. I had somewhat promised that I'd offer the new setup to process for others, but I'm starting before I may even get it.
reacted to JLouisBiz's post with 👍 8 days ago
3518
Article: https://huggingface.co/blog/JLouisBiz/semantical-website-links

You don't need to do the tedious work of finding all those links on your huge website.

Automating semantic links on websites using Large Language Models (LLMs) enhances user experience and efficiency. Here's a simplified workflow:

1. Store LLM embeddings in PostgreSQL: Use the vector data type to store text embeddings generated by an LLM.
2. Divide page texts into chunks for processing.
3. Generate embeddings using an LLM for each chunk of text.
4. Create template markup around specific terms needing links.

An automated program then:

- Converts marked-up terms to their corresponding LLM embeddings,
- Compares these with stored database embeddings (using cosine similarity),
- Identifies the most relevant page based on highest similarity score, and
- Automatically adds a link from the original content to this contextually related information.

This process improves navigation by directing users to highly contextual pages. It saves time as it automates creating semantic links while maintaining accuracy.
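The matching step in this workflow reduces to nearest-neighbor search by cosine similarity. A toy sketch of that step (in production you would query the `vector` type in PostgreSQL mentioned above rather than looping in Python):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_page(term_embedding, page_embeddings):
    """Return the page id whose stored embedding is most similar to the term's."""
    return max(
        page_embeddings,
        key=lambda pid: cosine_similarity(term_embedding, page_embeddings[pid]),
    )
```

The page with the highest similarity score becomes the link target for the marked-up term.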
reacted to zamal's post with 👍 8 days ago
DeepGit: Your GitHub Gold Digger! 💰🚀
Hey Hugging Face gang! Meet DeepGit - my open-source sidekick that rips through GitHub to snag repos that fit you. Done with dead-end searches? Me too. Built it with LangGraph and some dope tricks:
Embeddings grab the good stuff (HF magic, baby!)

Re-ranking nails the best picks

Snoops docs, code, and buzz in one slick flow

Drops a clean list of hidden gems 💎

Unearth that sneaky ML lib or Python gem - run python app.py or langgraph dev and boom! Peek it at https://github.com/zamalali/DeepGit. Fork it, tweak it, love it - Docker's in, HF vibes are strong. Drop a 🌟 or a crazy idea - I'm pumped to jam with you all! 🪂
reacted to jeffboudier's post with 🤗 8 days ago
Enterprise orgs now enable serverless Inference Providers for all members
- includes $2 of free usage per org member (e.g. an Enterprise org with 1,000 members shares $2,000 in free credit each month)
- admins can set a monthly spend limit for the entire org
- works today with Together, fal, Novita, Cerebras and HF Inference.

Here's the doc to bill Inference Providers usage to your org: https://huggingface.co/docs/inference-providers/pricing#organization-billing
reacted to fdaudens's post with 🔥 8 days ago
Did we just drop personalized AI evaluation?! This tool auto-generates custom benchmarks on your docs to test which models are the best.

Most benchmarks test general capabilities, but what matters is how models handle your data and tasks. YourBench helps answer critical questions like:
- Do you really need a hundreds-of-billions-parameter model sledgehammer to crack a nut?
- Could a smaller, fine-tuned model work better?
- How well do different models understand your domain?

Some cool features:
📚 Generates custom benchmarks from your own documents (PDFs, Word, HTML)
🎯 Tests models on real tasks, not just general capabilities
🔄 Supports multiple models for different pipeline stages
🧠 Generates both single-hop and multi-hop questions
🔍 Evaluates top models and deploys leaderboards instantly
💰 Full cost analysis to optimize for your budget
🛠️ Fully configurable via a single YAML file

26 SOTA models tested for question generation. Interesting finding: Qwen2.5 32B leads in question diversity, while smaller Qwen models and Gemini 2.0 Flash offer great value for cost.

You can also run it locally on any models you want.

I'm impressed. Try it out: yourbench/demo
replied to onekq's post 8 days ago

ZBT SRAM (perhaps stacked) caches all over everything packed closely together and more cores and threads, then. Something about Cyclops64 architecture still appeals to me. But then, are you just making a smaller WSE-3? Somehow, I wish OPUs would catch on for the tasks they're optimized for. But NPUs may be beating GPUs more often in common usage soon.

reacted to clem's post with ❤️ 8 days ago
Hugging Face is becoming the best place to share the most viral AI apps with Spaces.

Kolors Virtual Try-on just crossed 6,000,000 unique visitors & is now the #5 most popular space. Congrats to the Kwai Kolors team!

Kwai-Kolors/Kolors-Virtual-Try-On
  • 2 replies
ยท
reacted to fdaudens's post with 👀 8 days ago
🎨 Designers, meet OmniSVG! This new model helps you create professional vector graphics from text/images, generate editable SVGs from icons to detailed characters, convert rasters to vectors, maintain style consistency with references, and integrate into your workflow.

@OmniSVG
reacted to Kseniase's post with 👍 8 days ago
16 new research papers on inference-time scaling:

Over the last couple of weeks, a large number of studies on inference-time scaling have emerged. And it's so cool, because each new paper adds a trick to the toolbox, making LLMs more capable without needing to scale the parameter count of the models.

So here are 13 new methods + 3 comprehensive studies on test-time scaling:

1. Inference-Time Scaling for Generalist Reward Modeling (2504.02495)
Probably, the most popular study. It proposes to boost inference-time scalability by improving reward modeling. To enhance performance, DeepSeek-GRM uses adaptive critiques, parallel sampling, pointwise generative RM, and Self-Principled Critique Tuning (SPCT)

2. T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models (2504.04718)
Allows small models to use external tools, like code interpreters and calculators, to enhance self-verification

3. Z1: Efficient Test-time Scaling with Code (2504.00810)
Proposes to train LLMs on code-based reasoning paths to make test-time scaling more efficient, limiting unnecessary tokens with a special dataset and a Shifted Thinking Window

4. GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning (2504.00891)
Introduces GenPRM, a generative PRM, that uses CoT reasoning and code verification for step-by-step judgment. With only 23K training examples, GenPRM outperforms prior PRMs and larger models

5. Can Test-Time Scaling Improve World Foundation Model? (2503.24320)
SWIFT test-time scaling framework improves World Models' performance without retraining, using strategies like fast tokenization, Top-K pruning, and efficient beam search

6. Relevance Isn't All You Need: Scaling RAG Systems With Inference-Time Compute Via Multi-Criteria Reranking (2504.07104)
Proposes REBEL for RAG systems scaling, which uses multi-criteria optimization with CoT prompting for better performance-speed tradeoffs as inference compute increases

7. φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation (2503.13288)
Proposes a φ-Decoding strategy that uses foresight sampling, clustering, and adaptive pruning to estimate and select optimal reasoning steps

Read further below 👇

Also, subscribe to the Turing Post https://www.turingpost.com/subscribe
reacted to jasoncorkill's post with 🔥 10 days ago
3217
🚀 We tried something new!

We just published a dataset using a new (for us) preference modality: direct ranking based on aesthetic preference. We ranked a couple of thousand images from most to least preferred, all sampled from the Open Image Preferences v1 dataset by the amazing @data-is-better-together team.

📊 Check it out here:
Rapidata/2k-ranked-images-open-image-preferences-v1

We're really curious to hear your thoughts!
Is this kind of ranking interesting or useful to you? Let us know! 💬

If it is, please consider leaving a ❤️ and if we hit 30 ❤️s, we'll go ahead and rank the full 17k image dataset!
reacted to merterbak's post with 🔥 10 days ago
OpenAI has released BrowseComp, an open-source benchmark designed to evaluate the web browsing capabilities of AI agents. The dataset, comprising 1,266 questions, challenges AI models to navigate the web and uncover complex, obscure information. Crafted by human trainers, the questions are intentionally difficult: unsolvable by another person in under ten minutes, and beyond the reach of existing models like ChatGPT (with and without browsing) and an early version of OpenAI's Deep Research tool.

Blog Post: https://openai.com/index/browsecomp/
Paper: https://cdn.openai.com/pdf/5e10f4ab-d6f7-442e-9508-59515c65e35d/browsecomp.pdf
Code in simple eval repo: https://github.com/openai/simple-evals
replied to nomadicsynth's post 10 days ago

Also, tax write-offs for charity research that is bound to make some profits, I'm sure.