|
--- |
|
base_model: unsloth/mistral-7b-instruct-v0.3-bnb-4bit |
|
library_name: transformers |
|
license: apache-2.0 |
|
pipeline_tag: text-generation |
|
tags: |
|
- bias-detection |
|
- logical-fallacy |
|
- critical-thinking |
|
- rationality |
|
- unsloth |
|
language: |
|
- en |
|
--- |
|
|
|
# Model Card for Rationality Debugger
|
|
|
This is a QLoRA adapter dedicated to identifying sophisms and cognitive biases.

Its current performance is 85%-100% in detecting sophisms and 85%-100% in detecting cognitive biases (see the performance charts below).

It was trained on a custom dataset of 14k lines.
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
Rationality Debugger is a QLoRA fine-tune of Mistral 7B Instruct v0.3 that analyzes a piece of argumentation, labels it valid or invalid, and lists any sophisms (logical fallacies) and cognitive biases it detects, together with the exact snippets and a short explanation.
|
|
|
|
|
|
|
- **Developed by:** Arthur Vigier

- **Model type:** QLoRA adapter

- **Language(s) (NLP):** English

- **License:** Apache 2.0

- **Finetuned from model:** unsloth/mistral-7b-instruct-v0.3-bnb-4bit
|
|
|
## Uses |
|
|
|
It is intended for anyone who wants to evaluate public discourse by examining the foundations of its language and the solidity of its arguments. It is also well suited to education and to strengthening critical thinking.
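
As a quick illustration, here is a minimal sketch assuming the `RationalityDebugger` helper class defined under Direct Use below, with a made-up example sentence:

```python
# Minimal sketch: assumes the RationalityDebugger class from the
# "Direct Use" section below has already been defined or imported.
analyzer = RationalityDebugger()

# Made-up ad hominem example: the speaker is attacked rather than the claim.
result = analyzer.analyze(
    "Don't listen to his tax proposal. He has been divorced twice."
)
print(result["structured"]["detected_sophisms"])  # e.g. ['ad hominem']
```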
|
|
|
### API |
|
|
|
A public API is coming soon.
|
|
|
### Performance charts

#### Sophism

<img src="https://imgur.com/Vby0Ocq.png" width="500"/>

#### Cognitive Bias

<img src="https://imgur.com/RbGxSyN.png" width="500"/>
|
|
|
### Direct Use |
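
The snippet below loads the base model (in 4-bit when bitsandbytes is available), applies the LoRA adapters, and wraps generation plus a light keyword-based parsing of the output in a small helper class.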
|
|
|
```python
import re

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel


class RationalityDebugger:
    def __init__(self, base_model="mistralai/Mistral-7B-Instruct-v0.3", lora_model="Artvv/rationality-debugger-v1.0"):
        """
        Initialize the cognitive bias and logical fallacy detector.

        Args:
            base_model: Base model from Hugging Face (the adapters were trained
                on Mistral 7B Instruct v0.3; see the model card metadata)
            lora_model: LoRA adapters for rationality analysis
        """
        print(f"Loading base model: {base_model}")
        self.tokenizer = AutoTokenizer.from_pretrained(base_model)

        # Options for optimized loading
        model_kwargs = {
            "torch_dtype": torch.float16,
            "device_map": "auto",
            "low_cpu_mem_usage": True
        }

        # Try first with 4-bit quantization to save memory
        try:
            from transformers import BitsAndBytesConfig
            quantization_config = BitsAndBytesConfig(
                load_in_4bit=True,
                bnb_4bit_compute_dtype=torch.float16,
                bnb_4bit_use_double_quant=True
            )
            model_kwargs["quantization_config"] = quantization_config
            self.base_model = AutoModelForCausalLM.from_pretrained(base_model, **model_kwargs)
        except Exception:
            # Fallback if bitsandbytes is not available
            print("4-bit quantization not available, using standard loading...")
            model_kwargs.pop("quantization_config", None)
            self.base_model = AutoModelForCausalLM.from_pretrained(base_model, **model_kwargs)

        print(f"Applying LoRA adapters: {lora_model}")
        self.model = PeftModel.from_pretrained(self.base_model, lora_model)
        self.model.eval()  # Evaluation mode

        self.prompt_template = """
Analyze the following argument and identify any logical fallacies or cognitive biases:

{text}

###OUTPUT FORMAT
[Argument] Valid/Invalid
→ If Valid: Type: [ANALYTICAL / INDUCTIVE / ABDUCTIVE]
[Sophisms] Yes/No
→ If Yes: Which: [List detected fallacies]
→ Extract(s): [Provide exact snippet(s)]
[Biases] Yes/No
→ If Yes: Which: [List detected biases]
→ Extract(s): [Provide exact snippet(s)]

[Short explanation]
"""

    def analyze(self, text, max_new_tokens=200, temperature=0.1):
        """
        Analyze text to detect cognitive biases and logical fallacies.

        Args:
            text: Text to analyze
            max_new_tokens: Maximum number of new tokens to generate
            temperature: Temperature for generation (lower = more deterministic)

        Returns:
            dict: Structured analysis result and raw text
        """
        prompt = self.prompt_template.format(text=text)
        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)

        with torch.no_grad():
            outputs = self.model.generate(
                **inputs,
                max_new_tokens=max_new_tokens,
                temperature=temperature,
                top_p=0.9,
                do_sample=temperature > 0
            )

        # Extract only the generated part (not the prompt)
        generated_text = self.tokenizer.decode(
            outputs[0][inputs.input_ids.shape[1]:],
            skip_special_tokens=True
        )

        # Parse the response to extract the structure
        result = self._parse_response(generated_text)

        return {
            "raw_text": generated_text,
            "structured": result
        }

    def _parse_response(self, text):
        """Parse the model's response to extract structured information"""
        result = {
            "argument_valid": None,
            "argument_type": None,
            "has_sophisms": None,
            "detected_sophisms": [],
            "has_biases": None,
            "detected_biases": [],
            "too_short": False,
            "explanation": ""
        }

        # Simple parsing example - adapt as needed
        text_lower = text.lower()

        # Argument validity detection
        if "valid argument" in text_lower or "[argument] valid" in text_lower:
            result["argument_valid"] = True
        elif "invalid argument" in text_lower or "[argument] invalid" in text_lower:
            result["argument_valid"] = False

        # Argument type detection
        for arg_type in ["ANALYTICAL", "INDUCTIVE", "ABDUCTIVE"]:
            if arg_type.lower() in text_lower:
                result["argument_type"] = arg_type

        # Fallacy detection
        sophism_keywords = ["ad hominem", "straw man", "red herring", "false dilemma",
                            "slippery slope", "post hoc", "circular reasoning"]
        for sophism in sophism_keywords:
            if sophism in text_lower:
                result["detected_sophisms"].append(sophism)
        result["has_sophisms"] = len(result["detected_sophisms"]) > 0

        # Cognitive bias detection
        bias_keywords = ["confirmation bias", "availability bias", "anchoring bias",
                         "hindsight bias", "halo effect", "dunning-kruger"]
        for bias in bias_keywords:
            if bias in text_lower:
                result["detected_biases"].append(bias)
        result["has_biases"] = len(result["detected_biases"]) > 0

        # Explanation
        explanation_match = re.search(r"\[Short explanation\](.*?)(?=$|\[)", text, re.DOTALL)
        if explanation_match:
            result["explanation"] = explanation_match.group(1).strip()
        else:
            # If no explanation tag, take the whole text
            result["explanation"] = text

        return result


# --- Usage example ---
if __name__ == "__main__":
    # Create the analyzer
    analyzer = RationalityDebugger(
        base_model="mistralai/Mistral-7B-Instruct-v0.3",
        lora_model="Artvv/rationality-debugger-v1.0"
    )

    # Analysis example
    argument = """
    All birds can fly. Penguins are birds. Therefore, penguins can fly.
    """

    result = analyzer.analyze(argument)

    # Display raw result
    print("\n=== RAW ANALYSIS ===")
    print(result["raw_text"])

    # Display structured result
    print("\n=== STRUCTURED ANALYSIS ===")
    print(f"Valid argument: {result['structured']['argument_valid']}")

    if result["structured"]["detected_sophisms"]:
        print("\nDetected fallacies:")
        for sophism in result["structured"]["detected_sophisms"]:
            print(f"- {sophism}")

    if result["structured"]["detected_biases"]:
        print("\nDetected cognitive biases:")
        for bias in result["structured"]["detected_biases"]:
            print(f"- {bias}")

    print("\nExplanation:")
    print(result["structured"]["explanation"])
```
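
If you prefer a standalone checkpoint (for example to serve the model without PEFT installed), the adapter can be folded into the base weights with PEFT's `merge_and_unload`. A minimal sketch, assuming the base model was loaded in fp16/fp32 rather than 4-bit (merging does not apply to 4-bit quantized weights), and a hypothetical output folder name:

```python
# Minimal sketch: merge the LoRA weights into the base model and save the result.
# Assumes `analyzer` was created with a non-quantized (fp16/fp32) base model;
# merge_and_unload() cannot fold adapters into 4-bit quantized weights.
merged_model = analyzer.model.merge_and_unload()
merged_model.save_pretrained("rationality-debugger-merged")   # hypothetical path
analyzer.tokenizer.save_pretrained("rationality-debugger-merged")
```

The merged folder can then be loaded with `AutoModelForCausalLM.from_pretrained` like any regular checkpoint.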
|
|
|
### Out-of-Scope Use |
|
|
|
It is not intended to be used to harass anyone or to be rude.
|
|
|
## Bias, Risks, and Limitations |
|
|
|
|
|
|
[More Information Needed] |
|
|
|
### Recommendations |
|
|
|
It is very effective on the most common sophisms and cognitive biases, but it can be less reliable on more niche ones, such as the frequency-illusion bias.

It is mainly dedicated to detecting sophisms and cognitive biases; it can also recognize valid reasoning, but that is not its main purpose.
|
|
|
## Model Card Contact |
|
|
|
mail : [email protected] |
|
|
|
|
|
### Framework versions |
|
|
|
- PEFT 0.14.0 |
|
|
|
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |