hal

community

https://github.com/benediktstroebl/agent-eval-harness/tree/main

benediktstroebl

AI & ML interests

None defined yet.

Recent Activity

benediktstroebl updated a dataset about 13 hours ago

agent-evals/hal_traces

ronch99 authored a paper 7 months ago

Automatic Evaluation of Attribution by Large Language Models

Spaces (2)

Agent Leaderboard

Display agent leaderboards for various benchmarks

Agent Leaderboard

Models

None public yet

Datasets (3)

agent-evals/hal_traces

Updated about 13 hours ago • 67

agent-evals/agent_traces

Updated 17 days ago • 482

agent-evals/results

Updated Jan 16 • 16