---
base_model: unsloth/mistral-7b-instruct-v0.3-bnb-4bit
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
tags:
- bias-detection
- logical-fallacy
- critical-thinking
- rationality
- unsloth
language:
- en
---
# Model Card for rationality-debugger-v1.0
This is a QLoRA adapter dedicated to identifying sophisms (logical fallacies) and cognitive biases. Its current performance is 85%-100% at detecting sophisms and 85%-100% at detecting cognitive biases.

It was trained on a custom dataset of 14k lines.
## Model Details

### Model Description
- Developed by: Arthur Vigier
- Model type: QLoRA
- Language(s) (NLP): English
- License: Apache 2.0
- Finetuned from model: unsloth/mistral-7b-instruct-v0.3-bnb-4bit
## Uses
It is intended for anyone who wants to evaluate public discourse on the foundations of its language and the solidity of its reasoning. Using it in education and to strengthen critical thinking is also a good fit for this tool.
## API

Public API coming soon.
## Performance charts

*Charts for sophism detection and cognitive bias detection to be added.*
## Direct Use

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import re
class RationalityDebugger:
    def __init__(self, base_model="mistralai/Mistral-7B-Instruct-v0.3", lora_model="Artvv/rationality-debugger-v1.0"):
        """
        Initialize the cognitive bias and logical fallacy detector.

        Args:
            base_model: Base model from Hugging Face (the adapters were trained
                on mistral-7b-instruct-v0.3, per the model details above)
            lora_model: LoRA adapters for rationality analysis
        """
        print(f"Loading base model: {base_model}")
        self.tokenizer = AutoTokenizer.from_pretrained(base_model)

        # Options for optimized loading
        model_kwargs = {
            "torch_dtype": torch.float16,
            "device_map": "auto",
            "low_cpu_mem_usage": True
        }

        # Try first with 4-bit quantization to save memory
        try:
            from transformers import BitsAndBytesConfig
            quantization_config = BitsAndBytesConfig(
                load_in_4bit=True,
                bnb_4bit_compute_dtype=torch.float16,
                bnb_4bit_use_double_quant=True
            )
            model_kwargs["quantization_config"] = quantization_config
            self.base_model = AutoModelForCausalLM.from_pretrained(base_model, **model_kwargs)
        except Exception:
            # Fallback if bitsandbytes is not available: drop the quantization
            # config so the retry does not fail for the same reason
            print("4-bit quantization not available, using standard loading...")
            model_kwargs.pop("quantization_config", None)
            self.base_model = AutoModelForCausalLM.from_pretrained(base_model, **model_kwargs)

        print(f"Applying LoRA adapters: {lora_model}")
        self.model = PeftModel.from_pretrained(self.base_model, lora_model)
        self.model.eval()  # Evaluation mode
        self.prompt_template = """
Analyze the following argument and identify any logical fallacies or cognitive biases:
{text}
###OUTPUT FORMAT
[Argument] Valid/Invalid
→ If Valid: Type: [ANALYTICAL / INDUCTIVE / ABDUCTIVE]
[Sophisms] Yes/No
→ If Yes: Which: [List detected fallacies]
→ Extract(s): [Provide exact snippet(s)]
[Biases] Yes/No
→ If Yes: Which: [List detected biases]
→ Extract(s): [Provide exact snippet(s)]
[Short explanation]
"""
    def analyze(self, text, max_new_tokens=200, temperature=0.1):
        """
        Analyze text to detect cognitive biases and logical fallacies.

        Args:
            text: Text to analyze
            max_new_tokens: Maximum number of new tokens to generate
            temperature: Temperature for generation (lower = more deterministic)

        Returns:
            dict: Structured analysis result and raw text
        """
        prompt = self.prompt_template.format(text=text)
        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)

        with torch.no_grad():
            outputs = self.model.generate(
                **inputs,
                max_new_tokens=max_new_tokens,
                temperature=temperature,
                top_p=0.9,
                do_sample=temperature > 0,
                pad_token_id=self.tokenizer.eos_token_id  # Mistral has no pad token
            )

        # Extract only the generated part (not the prompt)
        generated_text = self.tokenizer.decode(
            outputs[0][inputs.input_ids.shape[1]:],
            skip_special_tokens=True
        )

        # Parse the response to extract the structure
        result = self._parse_response(generated_text)

        return {
            "raw_text": generated_text,
            "structured": result
        }
    def _parse_response(self, text):
        """Parse the model's response to extract structured information"""
        result = {
            "argument_valid": None,
            "argument_type": None,
            "has_sophisms": None,
            "detected_sophisms": [],
            "has_biases": None,
            "detected_biases": [],
            "too_short": False,
            "explanation": ""
        }

        # Simple parsing example - adapt as needed
        text_lower = text.lower()

        # Argument validity detection: check "invalid" first, because the
        # string "invalid argument" also contains the substring "valid argument"
        if "invalid argument" in text_lower or "[argument] invalid" in text_lower:
            result["argument_valid"] = False
        elif "valid argument" in text_lower or "[argument] valid" in text_lower:
            result["argument_valid"] = True

        # Argument type detection
        for arg_type in ["ANALYTICAL", "INDUCTIVE", "ABDUCTIVE"]:
            if arg_type.lower() in text_lower:
                result["argument_type"] = arg_type

        # Fallacy detection
        sophism_keywords = ["ad hominem", "straw man", "red herring", "false dilemma",
                            "slippery slope", "post hoc", "circular reasoning"]
        for sophism in sophism_keywords:
            if sophism in text_lower:
                result["detected_sophisms"].append(sophism)
        result["has_sophisms"] = len(result["detected_sophisms"]) > 0

        # Cognitive bias detection
        bias_keywords = ["confirmation bias", "availability bias", "anchoring bias",
                         "hindsight bias", "halo effect", "dunning-kruger"]
        for bias in bias_keywords:
            if bias in text_lower:
                result["detected_biases"].append(bias)
        result["has_biases"] = len(result["detected_biases"]) > 0

        # Explanation
        explanation_match = re.search(r"\[Short explanation\](.*?)(?=$|\[)", text, re.DOTALL)
        if explanation_match:
            result["explanation"] = explanation_match.group(1).strip()
        else:
            # If no explanation tag, take the whole text
            result["explanation"] = text

        return result
# --- Usage example ---
if __name__ == "__main__":
    # Create the analyzer
    analyzer = RationalityDebugger(
        base_model="mistralai/Mistral-7B-Instruct-v0.3",
        lora_model="Artvv/rationality-debugger-v1.0"
    )

    # Analysis example
    argument = """
    All birds can fly. Penguins are birds. Therefore, penguins can fly.
    """

    result = analyzer.analyze(argument)

    # Display raw result
    print("\n=== RAW ANALYSIS ===")
    print(result["raw_text"])

    # Display structured result
    print("\n=== STRUCTURED ANALYSIS ===")
    print(f"Valid argument: {result['structured']['argument_valid']}")

    if result["structured"]["detected_sophisms"]:
        print("\nDetected fallacies:")
        for sophism in result["structured"]["detected_sophisms"]:
            print(f"- {sophism}")

    if result["structured"]["detected_biases"]:
        print("\nDetected cognitive biases:")
        for bias in result["structured"]["detected_biases"]:
            print(f"- {bias}")

    print("\nExplanation:")
    print(result["structured"]["explanation"])
```
## Out-of-Scope Use

This model is not intended to be used to harass anyone or to be rude.
## Bias, Risks, and Limitations
[More Information Needed]
### Recommendations

The model is very effective on the most common sophisms and cognitive biases, but it can be less reliable on more niche ones such as the frequency illusion bias. It is primarily dedicated to detecting sophisms and cognitive biases; it can also recognize valid reasoning, but that is not its main purpose.
## Model Card Contact

Mail: [email protected]
### Framework versions
- PEFT 0.14.0
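Only the PEFT version is pinned by this card; the Direct Use example also imports transformers, torch, and (optionally) bitsandbytes plus accelerate for 4-bit loading, whose versions are not pinned. A minimal sketch, assuming those package names, to check what is installed:

```python
# Prints installed versions of the packages the Direct Use example relies on.
# Only PEFT 0.14.0 is pinned by this card; the other entries are assumptions.
import importlib.metadata as metadata

for pkg in ["transformers", "peft", "torch", "bitsandbytes", "accelerate"]:
    try:
        print(f"{pkg}=={metadata.version(pkg)}")
    except metadata.PackageNotFoundError:
        print(f"{pkg}: not installed")
```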