Update README.md
README.md
CHANGED
@@ -2,7 +2,6 @@
 library_name: transformers
 license: cc-by-nc-4.0
 datasets:
-- oumi-ai/oumi-anli-subset
 - oumi-ai/oumi-c2d-d2c-subset
 - oumi-ai/oumi-synthetic-claims
 - oumi-ai/oumi-synthetic-document-claims
@@ -23,16 +22,16 @@ base_model:
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-Introducing **HallOumi-8B-classifier**, a **SOTA hallucination detection model**, outperforming DeepSeek R1, OpenAI o1, Google Gemini 1.5 Pro, and Anthropic Sonnet 3.5 at only
+Introducing **HallOumi-8B-classifier**, a _fast_ **SOTA hallucination detection model**, outperforming DeepSeek R1, OpenAI o1, Google Gemini 1.5 Pro, and Anthropic Sonnet 3.5 at only 8 billion parameters!
 
-Give HallOumi a try now!
+<!-- Give HallOumi a try now! -->
 
-* Demo: https://oumi.ai/halloumi-demo
-* Github: https://github.com/oumi-ai/oumi/tree/main/configs/projects/halloumi
+<!-- * Demo: https://oumi.ai/halloumi-demo -->
+<!-- * Github: https://github.com/oumi-ai/oumi/tree/main/configs/projects/halloumi -->
 
 | Model | Balanced Accuracy | Macro F1 Score | Open Source? | Model Size |
 | --------------------- | ----------------- | --------------------------------------- | ------------ | ---------- |
-| **HallOumi-8B** | **
+| **HallOumi-8B-classifier** | **76.8% ± 2.0%** | **78.5% ± 2.1%** | ✔️ | 8B |
 | Anthropic Sonnet 3.5 | 67.3% ± 2.7% | 69.6% ± 2.8% | ❌ | ?? |
 | OpenAI o1-preview | 64.5% ± 2.0% | 65.9% ± 2.3% | ❌ | ?? |
 | DeepSeek R1 | 60.7% ± 2.1% | 61.6% ± 2.5% | ✔️ | 671B |
@@ -47,7 +46,7 @@ For example, when given one or more context documents, as well as an AI-generate
 * A determination whether that particular statement is **supported or unsupported** by the provided context.
 * An **explanation** describing why a particular claim is supported or unsupported.
 
-**HallOumi-8B-classifier
+**HallOumi-8B-classifier**, the hallucination classification model built with Oumi, is an end-to-end classification system that enables *fast and accurate* assessment of the hallucination probability of any written content (AI or human-generated).
 * ✔️ Fast
 * ✔️ Per-claim support (must call once per claim)
 * ❌ No Explanations
@@ -79,7 +78,7 @@ however, this is not enough, as we have to be capable of doing these things in a
 - **Language(s) (NLP):** English
 - **License:** [CC-BY-NC-4.0](https://creativecommons.org/licenses/by-nc/4.0/deed.en)
 - **Finetuned from model:** [Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
-- **Demo:** [HallOumi Demo](https://oumi.ai/halloumi)
+<!-- - **Demo:** [HallOumi Demo](https://oumi.ai/halloumi) -->
 
 ---
 
@@ -88,7 +87,7 @@ however, this is not enough, as we have to be capable of doing these things in a
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 Use to verify claims/detect hallucinations in scenarios where a known source of truth is available.
 
-Demo: https://oumi.ai/halloumi
+<!-- Demo: https://oumi.ai/halloumi -->
 
 ## Out-of-Scope Use
 
@@ -125,11 +124,11 @@ Eval notebook: Coming Soon
 
 <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
 
-- **Hardware Type:**
-- **Hours used:**
+- **Hardware Type:** A100-80GB
+- **Hours used:** 1.5 (4 * 8 GPUs)
 - **Cloud Provider:** Google Cloud Platform
 - **Compute Region:** us-east5
-- **Carbon Emitted:**
+- **Carbon Emitted:** 0.15 kg
 
 ## Citation
 
@@ -137,11 +136,11 @@ Eval notebook: Coming Soon
 
 ```
 @misc{oumiHalloumi8BClassifier,
-author = {Jeremiah Greer},
+author = {Achlioptas Panos, Jeremiah Greer, Aisopos Kostas, Schuler A. Michael, Elachqar Oussama, Koukoumidis Emmanouil},
 title = {HallOumi-8B-classifier},
 month = {March},
 year = {2025},
-url = {https://huggingface.co/oumi-ai/HallOumi-8B}
+url = {https://huggingface.co/oumi-ai/HallOumi-8B-classifier}
 }
 
 @software{oumi2025,