Gregor Betz PRO

ggbetz

AI & ML interests

Reasoning, AGI, AI Safety, AI Reliability

Recent Activity

published a Space 13 days ago
DebateLabKIT/argdown-feedback
updated a Space 13 days ago
DebateLabKIT/argdown-feedback

Organizations

Logikon AI, DebateLab at KIT, MLX Community, Open CoT Leaderboard

ggbetz's activity

reacted to thomwolf's post with ❤️👍 25 days ago
The new DeepSite space is really insane for vibe-coders
enzostvs/deepsite

With the wave of vibe-coding-optimized LLMs like the latest open-source DeepSeek model (version V3-0324), you can basically prompt out of the box and create any app or game in one shot.

It feels so powerful to me, no more complex framework or under-the-hood prompt engineering to have a working text-to-app tool.

AI is eating the world and *open-source* AI is eating AI itself!

PS: even more meta is that the DeepSite app and the DeepSeek model are both fully open-source => time to start recursively improving?

PPS: you still need some inference hosting unless you're running the 600B param model at home, so check the very nice list of HF Inference Providers for this model: deepseek-ai/DeepSeek-V3-0324
replied to singhsidhukuldeep's post 3 months ago
posted an update 3 months ago
We've just released syncIALO -- a multi-purpose synthetic debate and argument mapping corpus with more than 600k arguments:

📝 Blog article: https://huggingface.co/blog/ggbetz/introducing-syncialo
🛢️ Dataset: DebateLabKIT/syncialo-raw
👩‍💻 Code: https://github.com/debatelab/syncIALO

🤗 Hugging Face has sponsored the syncIALO project through inference time / compute credits. 🙏 We gratefully acknowledge the generous support. 🫶
posted an update 8 months ago
Hi, just a brief follow-up on our Guided Reasoning (GuiR) system:

I've created a template space that facilitates testing:

1. Duplicate space logikon/guir-chat
2. Set up your own inference servers and provide their details in the config file
3. Add API keys as secrets
4. Your personal GuiR playground is ready

Cheers, Gregor
posted an update 8 months ago
🧭 Guided Reasoning

👋 Hi everyone,

We've released Guided Reasoning:

Our AI guides walk your favorite LLM through complex reasoning problems.

🎯 Goals:

1️⃣ Reliability. AIs consistently follow reasoning methods.
2️⃣ Self-explainability. AIs see reasoning protocols and can explain internal deliberation.
3️⃣ Contestability. Users may amend AI reasoning and revise plausibility assessments.

Try out Guided Reasoning with our light demo chatbot, powered by 🤗 Hugging Face's free Inference API and small LLMs. (Sorry for the poor latency and limited availability -- we are currently searching for 💸 compute sponsors to run more powerful models, faster, and optimize guided reasoning performance.)

Built on top of Logikon's open-source AI reasoning analytics.

Demo chat app: logikon/benjamin-chat
GitHub: https://github.com/logikon-ai/logikon
Technical report: https://arxiv.org/abs/2408.16331

➡️ Check it out and get involved! Looking forward to hearing from you.
replied to their post about 1 year ago

Sorry about this, it is up and running again.

replied to their post about 1 year ago

Hi @clefourrier, thx for the quick reply and positive feedback. Absolutely, we'd be happy to collaborate on a blog post. :-) I've been following the Leaderboards on the Hub series -- should we write an initial draft and share it with you, or would you suggest another way to proceed?

posted an update about 1 year ago
🥇 Open CoT Leaderboard

We're delighted to announce the [Open CoT Leaderboard](logikon/open_cot_leaderboard) on 🤗 Spaces.

Unlike other LLM performance leaderboards, the Open CoT Leaderboard is not tracking absolute benchmark accuracies, but relative **accuracy gains** due to **chain-of-thought**.
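To make the idea of a relative gain concrete, here is a minimal sketch, assuming the gain is the plain difference between accuracy with and without chain-of-thought on the same questions. The leaderboard's exact metric is not spelled out in this post, and the function names and toy data below are hypothetical.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of predictions that exactly match the gold labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equally long")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def cot_accuracy_gain(
    base_preds: Sequence[str],
    cot_preds: Sequence[str],
    labels: Sequence[str],
) -> float:
    """Assumed definition: accuracy with CoT minus accuracy without CoT."""
    return accuracy(cot_preds, labels) - accuracy(base_preds, labels)


# Toy multiple-choice data (hypothetical, for illustration only).
labels = ["A", "B", "C", "D"]
base_preds = ["A", "C", "C", "A"]  # 2 of 4 correct without CoT
cot_preds = ["A", "B", "C", "A"]   # 3 of 4 correct with CoT

gain = cot_accuracy_gain(base_preds, cot_preds, labels)
print(f"accuracy gain from CoT: {gain:+.2f}")  # +0.25
```

Under this reading, a model with high absolute accuracy can still show a small or negative gain if chain-of-thought does not help it.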

Eval datasets that underpin the leaderboard are hosted [here](cot-leaderboard).

Feedback and suggestions more than welcome.

@clefourrier