---
library_name: transformers
license: apache-2.0
base_model: answerdotai/ModernBERT-base
tags:
- reasoning
- reasoning-datasets-competition
datasets:
- davanstrien/natural-reasoning-classifier
language:
- en
metrics:
- mse
- mae
- spearman
widget:
- text: >-
    The debate on artificial intelligence's role in society has become
    increasingly polarized. Some argue that AI will lead to widespread
    unemployment and concentration of power, while others contend it will
    create new jobs and democratize access to knowledge. These viewpoints
    reflect different assumptions about technological development, economic
    systems, and human adaptability.
---

# ModernBERT Reasoning Complexity Regressor

## Model Description

This model predicts the level of reasoning complexity (0-4) that a given web text requires. It is fine-tuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the [davanstrien/natural-reasoning-classifier](https://huggingface.co/datasets/davanstrien/natural-reasoning-classifier) dataset. It is intended for use in a pipeline that identifies text likely to be useful for generating reasoning data.

### Reasoning Complexity Scale

The reasoning complexity scale ranges from:

- **0: Minimal Reasoning** - Simple factual content requiring only recall
- **1: Basic Reasoning** - Straightforward connections or single-step logical processes
- **2: Intermediate Reasoning** - Integration of multiple factors or perspectives
- **3: Advanced Reasoning** - Sophisticated analysis across multiple dimensions
- **4: Expert Reasoning** - Theoretical frameworks and novel conceptual synthesis

## Performance

The model achieves the following results on the evaluation set:

- MSE: 0.2034
- MAE: 0.2578
- Spearman Correlation: 0.6963

## Intended Uses

This model can be used to:

- Filter and classify educational content by reasoning complexity
- Identify complex reasoning problems across diverse domains
- Serve as a first-stage filter in a reasoning dataset creation pipeline

## Limitations

- Predictions are influenced by the original dataset's domain distribution
- Reasoning complexity is subjective and context-dependent

## Training

The model was fine-tuned using a regression objective with the following settings:

- Learning rate: 5e-05
- Batch size: 16
- Optimizer: AdamW
- Schedule: Linear
- Epochs: 10

## Usage Examples

### Using the pipeline API

```python
from transformers import pipeline

pipe = pipeline("text-classification", model="davanstrien/ModernBERT-based-Reasoning-Required")

def predict_reasoning_level(text, pipe):
    # Get the raw prediction. function_to_apply="none" makes sure the
    # regression output is returned as-is, without a sigmoid or softmax.
    result = pipe(text, function_to_apply="none")
    score = result[0]["score"]

    # Round to nearest integer (optional)
    rounded_score = round(score)

    # Clip to valid range (0-4)
    rounded_score = max(0, min(4, rounded_score))

    # Human-readable interpretation (optional), matching the scale above
    reasoning_labels = {
        0: "Minimal reasoning",
        1: "Basic reasoning",
        2: "Intermediate reasoning",
        3: "Advanced reasoning",
        4: "Expert reasoning",
    }

    return {
        "raw_score": score,
        "reasoning_level": rounded_score,
        "interpretation": reasoning_labels[rounded_score],
    }

# Usage
text = "This argument uses multiple sources and evaluates competing perspectives before reaching a conclusion."
result = predict_reasoning_level(text, pipe)
print(f"Raw score: {result['raw_score']:.2f}")
print(f"Reasoning level: {result['reasoning_level']}")
print(f"Interpretation: {result['interpretation']}")
```

### Using the model directly

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model and tokenizer (same checkpoint as in the pipeline example)
model_name = "davanstrien/ModernBERT-based-Reasoning-Required"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare text
text = "The debate on artificial intelligence's role in society has become increasingly polarized."

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)

# Get regression score (a single logit for a single input)
complexity_score = outputs.logits.item()
print(f"Reasoning Complexity: {complexity_score:.2f}/4.00")
```
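
### Filtering a batch by predicted level

The first-stage filtering use listed under Intended Uses can be sketched as a small post-processing step over raw regression scores. This is a minimal illustration under stated assumptions, not the author's pipeline: the helper names `score_to_level` and `filter_by_level`, the threshold of 3, and the example scores are all hypothetical. The sketch works on plain floats, so it can sit behind either usage example above. It rounds halves up explicitly, because Python's built-in `round` sends exact halves to the nearest even integer (`round(2.5)` is `2`).

```python
import math
from typing import List, Tuple


def score_to_level(score: float, low: int = 0, high: int = 4) -> int:
    """Map a raw regression score to an integer level on the 0-4 scale."""
    # Round half up, so 2.5 becomes 3 rather than 2.
    rounded = math.floor(score + 0.5)
    # Clamp, since a regression head can predict values outside [0, 4].
    return max(low, min(high, rounded))


def filter_by_level(
    texts: List[str], scores: List[float], min_level: int = 3
) -> List[Tuple[str, int]]:
    """Keep (text, level) pairs whose level is at least min_level."""
    if len(texts) != len(scores):
        raise ValueError("texts and scores must have the same length")
    levels = [score_to_level(s) for s in scores]
    return [(t, lvl) for t, lvl in zip(texts, levels) if lvl >= min_level]


# Hypothetical scores standing in for model outputs.
texts = ["doc a", "doc b", "doc c", "doc d"]
scores = [0.4, 2.5, 3.7, 4.6]
kept = filter_by_level(texts, scores, min_level=3)
print(kept)
```

In practice, `scores` would come from the model. For a batch passed through the direct-usage example, something like `outputs.logits.squeeze(-1).tolist()` should yield one float per input. The right threshold depends on how much recall the downstream pipeline needs and is best chosen by inspecting a sample of scored texts.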