Hugging Face
Roee Aharoni
roeeaharoni
12 followers · 2 following
http://www.roeeaharoni.com
AI & ML interests
Natural Language Processing
Recent Activity
upvoted a paper 27 days ago
Inside-Out: Hidden Factual Knowledge in LLMs
liked a dataset 9 months ago
google/granola-entity-questions
reacted to gsarti's post about 1 year ago
Today's pick in Interpretability & Analysis of LMs: "A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains" by @alonjacovi, @yonatanbitton, B. Bohnet, J. Herzig, @orhonovic, M. Tseng, M. Collins, @roeeaharoni, and @mega
This work introduces a new methodology for human verification of reasoning chains and adopts it to annotate a dataset of chain-of-thought reasoning chains produced by 3 LMs. The annotated dataset, REVEAL, can be used to benchmark automatic verifiers of reasoning in LMs. In their analysis, the authors find that LM-produced CoTs generally contain faulty steps, often leading to incorrect automatic verification. In particular, CoT-generating LMs are found to produce non-attributable reasoning steps often, and reasoning verifiers generally struggle to verify logical correctness.
Paper: https://huggingface.co/papers/2402.00559
Dataset: https://huggingface.co/datasets/google/reveal
roeeaharoni's activity
liked a dataset 9 months ago
google/granola-entity-questions
Viewer • Updated Aug 1, 2024 • 12.5k • 111 • 8
liked a model over 1 year ago
google/t5_11b_trueteacher_and_anli
Text2Text Generation • Updated Dec 26, 2023 • 441 • 16
liked a model about 2 years ago
google/t5_xxl_true_nli_mixture
Text2Text Generation • Updated Mar 23, 2023 • 2.87k • 46