Commit 45377fb (verified) by panos-lema · 1 Parent(s): e309fb3

Update README.md

Files changed (1)
  1. README.md +13 -14
README.md CHANGED
````diff
@@ -2,7 +2,6 @@
  library_name: transformers
  license: cc-by-nc-4.0
  datasets:
- - oumi-ai/oumi-anli-subset
  - oumi-ai/oumi-c2d-d2c-subset
  - oumi-ai/oumi-synthetic-claims
  - oumi-ai/oumi-synthetic-document-claims
@@ -23,16 +22,16 @@ base_model:
 
  <!-- Provide a quick summary of what the model is/does. -->
 
- Introducing **HallOumi-8B-classifier**, a **SOTA hallucination detection model**, outperforming DeepSeek R1, OpenAI o1, Google Gemini 1.5 Pro, and Anthropic Sonnet 3.5 at only **8 billion parameters!**
+ Introducing **HallOumi-8B-classifier**, a _fast_ **SOTA hallucination detection model**, outperforming DeepSeek R1, OpenAI o1, Google Gemini 1.5 Pro, and Anthropic Sonnet 3.5 at only 8 billion parameters!
 
- Give HallOumi a try now!
+ <!-- Give HallOumi a try now! -->
 
- * Demo: https://oumi.ai/halloumi-demo
- * Github: https://github.com/oumi-ai/oumi/tree/main/configs/projects/halloumi
+ <!-- * Demo: https://oumi.ai/halloumi-demo -->
+ <!-- * Github: https://github.com/oumi-ai/oumi/tree/main/configs/projects/halloumi -->
 
  | Model | Balanced Accuracy | Macro F1 Score | Open Source? | Model Size |
  | --------------------- | ----------------- | --------------------------------------- | ------------ | ---------- |
- | **HallOumi-8B** | **73.0% ± 2.2%** | **75.1% ± 2.2%** | ✔️ | 8B |
+ | **HallOumi-8B-classifier** | **76.8% ± 2.0%** | **78.5% ± 2.1%** | ✔️ | 8B |
  | Anthropic Sonnet 3.5 | 67.3% ± 2.7% | 69.6% ± 2.8% | ❌ | ?? |
  | OpenAI o1-preview | 64.5% ± 2.0% | 65.9% ± 2.3% | ❌ | ?? |
  | DeepSeek R1 | 60.7% ± 2.1% | 61.6% ± 2.5% | ✔️ | 671B |
@@ -47,7 +46,7 @@ For example, when given one or more context documents, as well as an AI-generate
  * A determination whether that particular statement is **supported or unsupported** by the provided context.
  * An **explanation** describing why a particular claim is supported or unsupported.
 
- **HallOumi-8B-classifier** is trained with similar data to HallOumi-8B but is instead trained as a classifier rather than a generative model.
+ **HallOumi-8B-classifier**, the hallucination classification model built with Oumi, is an end-to-end classification system that enables *fast and accurate* assessment of the hallucination probability of any written content (AI or human-generated).
  * ✔️ Fast
  * ✔️ Per-claim support (must call once per claim)
  * ❌ No Explanations
@@ -79,7 +78,7 @@ however, this is not enough, as we have to be capable of doing these things in a
  - **Language(s) (NLP):** English
  - **License:** [CC-BY-NC-4.0](https://creativecommons.org/licenses/by-nc/4.0/deed.en)
  - **Finetuned from model:** [Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
- - **Demo:** [HallOumi Demo](https://oumi.ai/halloumi)
+ <!-- - **Demo:** [HallOumi Demo](https://oumi.ai/halloumi) -->
 
  ---
 
@@ -88,7 +87,7 @@ however, this is not enough, as we have to be capable of doing these things in a
  <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
  Use to verify claims/detect hallucinations in scenarios where a known source of truth is available.
 
- Demo: https://oumi.ai/halloumi
+ <!-- Demo: https://oumi.ai/halloumi -->
 
  ## Out-of-Scope Use
 
@@ -125,11 +124,11 @@ Eval notebook: Coming Soon
 
  <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
 
- - **Hardware Type:** H100
- - **Hours used:** 32 (4 * 8 GPUs)
+ - **Hardware Type:** A100-80GB
+ - **Hours used:** 1.5 (4 * 8 GPUs)
  - **Cloud Provider:** Google Cloud Platform
  - **Compute Region:** us-east5
- - **Carbon Emitted:** 2.8 kg
+ - **Carbon Emitted:** 0.15 kg
 
  ## Citation
 
@@ -137,11 +136,11 @@ Eval notebook: Coming Soon
 
  ```
  @misc{oumiHalloumi8BClassifier,
- author = {Jeremiah Greer},
+ author = {Achlioptas Panos, Jeremiah Greer, Aisopos Kostas, Schuler A. Michael, Elachqar Oussama, Koukoumidis Emmanouil},
  title = {HallOumi-8B-classifier},
  month = {March},
  year = {2025},
- url = {https://huggingface.co/oumi-ai/HallOumi-8B}
+ url = {https://huggingface.co/oumi-ai/HallOumi-8B-classifier}
  }
 
  @software{oumi2025,
````
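
Note on the comparison table touched by this commit: the README does not say how the Balanced Accuracy and Macro F1 Score columns were computed. Under the standard definitions (balanced accuracy as the mean of per-class recall, macro F1 as the unweighted mean of per-class F1), a minimal sketch might look like the following. The function names and the sample labels are hypothetical and only illustrate the definitions; they are not taken from the Oumi evaluation code.

```python
from collections import Counter


def _per_class_counts(y_true, y_pred):
    """Count true positives, false positives and false negatives per label."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    return labels, tp, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean recall over the classes that actually occur in y_true."""
    labels, tp, _, fn = _per_class_counts(y_true, y_pred)
    recalls = [tp[c] / (tp[c] + fn[c]) for c in labels if tp[c] + fn[c] > 0]
    return sum(recalls) / len(recalls)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (a class with no support scores 0)."""
    labels, tp, fp, fn = _per_class_counts(y_true, y_pred)
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical per-claim labels, using the README's supported/unsupported wording.
y_true = ["supported", "supported", "supported", "unsupported"]
y_pred = ["supported", "supported", "unsupported", "unsupported"]
print(balanced_accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

Both metrics weight the two classes equally, so a classifier cannot score well by always predicting the majority class; that is presumably why they were chosen for a supported/unsupported task, though the README does not state the reason.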